Questions tagged [amazon-redshift]

Amazon Redshift is a petabyte-scale data warehousing service that lets you analyze data using existing business intelligence tools. Redshift is a column-oriented MPP database based on ParAccel.

0
votes
1answer
5 views

Reset hadoop aws keys to upload to another s3 bucket under different username

Sorry for the horrible question title, but here is my scenario: I have a PySpark Databricks notebook in which I am loading other notebooks. One of these notebooks sets some Redshift configuration for ...
0
votes
1answer
10 views

Scheduler for materialized views Postgresql + Redshift

What I want is to update a table every night and cache it so it doesn't have to run each time we run a query based on it. So I figure I need a materialised view (not a view). Top answer to below ...
-3
votes
0answers
20 views

Redshift creating a table: Invalid operation: syntax error at or near “-” Position: 522;

I have spent hours looking at my scripts and I am not sure where I am going wrong creating a table in Redshift
0
votes
1answer
18 views

How can I access data in s3 bucket from one account to process the data using redshift in another account?

There is a huge amount of data in an S3 bucket in one of my AWS accounts, and I want to process it in another AWS account using Redshift. I want to save the cost of data transfer and storage since I already ...
0
votes
0answers
18 views

Unload a table from redshift to S3 in parquet format without python script

I found that we can use the spectrify Python module to convert to Parquet format, but I want to know which command will unload a table to an S3 location in Parquet format. One more thing I found is that we can ...
1
vote
3answers
13 views

Redshift - Error when converting UTC time to local time in where clause

I have some sales data that is recorded in UTC. I am trying to convert it to the local timezone where the sales happened. I have built up a query as below but get an error saying invalid operation: ...
1
vote
1answer
19 views

Incremental load of S3 folder files

What is an easy way to apply an incremental load to S3 folder files using Python? The date is taken from the file name, like "filename_180828_152153". I have tried inserting all the filenames and dates into the table....
0
votes
1answer
18 views

Can I alias an external table in redshift to remove the schema name?

We want to migrate tables to Spectrum, which requires defining an external schema create external schema spectrum from data catalog database 'spectrumdb' iam_role 'my_iam_role' create external ...
0
votes
2answers
19 views

Count of unique columns based on another row

If you have the following table: ID, item 1, A 2, A 1, A 1, B 3, C I would like to get these results ID, A, B, C 1, 2, 1, 0 2, 1, 0, 0 3, 0, 0, 1 There should be a column for each item type. In a ...
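
A minimal sketch of the expected result, in plain Python rather than Redshift SQL, using only the sample rows quoted above: it counts each (ID, item) pair and emits one column per item type. The names `rows` and `pivot_counts` are hypothetical, chosen for illustration.

```python
from collections import Counter

# Sample rows from the question, as (ID, item) pairs.
rows = [(1, "A"), (2, "A"), (1, "A"), (1, "B"), (3, "C")]


def pivot_counts(rows):
    """Count occurrences of each item per ID, with one column per item type."""
    items = sorted({item for _, item in rows})   # column order: A, B, C
    ids = sorted({id_ for id_, _ in rows})       # row order: 1, 2, 3
    counts = Counter(rows)                       # (ID, item) -> occurrences
    # Counter returns 0 for pairs that never occur, which fills the gaps.
    table = {id_: [counts[(id_, item)] for item in items] for id_ in ids}
    return items, table


items, table = pivot_counts(rows)
print("ID", *items)
for id_, row in table.items():
    print(id_, *row)
```

In SQL the same shape is usually obtained with one conditional aggregate per item type, which requires knowing the item values in advance.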
0
votes
1answer
29 views

extract time from now() in amazon redshift

I am trying to extract only the time part (excluding milliseconds) from the date-time string in Amazon Redshift. I used: Select now()::time; But it gives me an error. Error running query: ...
0
votes
1answer
46 views

cross join to get all dates and hours and avoid duplicate values

We have 2 tables: sales hourt (only 1 field (hourt) of numbers: 0 to 23) The goal is to list all dates and all 24 hours for each day and group hours that have sales. For hours that do not have sales,...
0
votes
1answer
24 views

“Error parsing the type of column” Redshift Spectrum

I have a use case for Spectrum using a large number of JSON files from S3. I started by crawling the data using a Glue crawler to create a data catalog. Then with that catalog I created an ...
0
votes
1answer
15 views

Failed to inflateinvalid or incomplete deflate data. zlib error code -3

I am trying to upload data to Redshift using S3. The file from which data is to be copied is in CSV format (say, named users.csv). I run the following command: copy user.dimension_users from 's3://<...
2
votes
0answers
32 views

Finding where a variable is stored in Flyway

I'm running Flyway using Maven, Redshift database and this is what I get: mvn -f postgresql.pom.xml clean compile flyway:migrate -Puat -Denv=xxx -Dxxx.flyway.properties=uat.flyway.properties [...
0
votes
2answers
19 views

Redshift - Finding number of times a flag appears for a particular ID

I have some sales data that shows whether a bill has been generated for a customer. The column labelled bill_generated returns 'Y' if a bill has been generated; otherwise it's blank. I am trying to find the list ...
0
votes
0answers
23 views

Power BI - Redshift odbc driver

We are working with Power BI connected to a Redshift cluster using the default driver that comes with Power BI, the Simba ODBC driver. We are experiencing really poor query performance compared ...
0
votes
1answer
68 views
+50

Create a lapsed concept based on logic across every row per ID

I am trying to get to a lapsed_date which is when there are >12 weeks (ie. 84 days) for a given ID between: 1) onboarded_at and current_date (if no applied_at exists) - this means lapsed_now if >84 ...
0
votes
2answers
21 views

Incremental Data Storage

I have time series daily data which I run a model on. The model runs in Spark. I only want to run the model daily, and append the results to the historic results. It is important to have a 'merged ...
0
votes
2answers
22 views

Redshift - Converting UTC data to other timezones

I am trying to convert data from UTC to various European timezones. I am using a case statement and find only the first condition in the case statement is executed while the other conditions are not ...
1
vote
1answer
29 views

How to compare multiple columns with other table in MySql

I have below mentioned two tables: Table1: ID Value1 Value2 Value3 Remarks RTE-10 2400.00 1.5 2300 Processed RTE-11 1300.00 1.8 1750 ...
0
votes
0answers
23 views

Netezza In-Built AGE function as UDF in Redshift

I'm trying to implement the Netezza AGE function in Redshift as a UDF. I am able to get the correct answer in Python (Spyder IDE - Py 3.6), but when I execute it in Redshift as a UDF, it gives me incorrect ...
0
votes
2answers
35 views

Duplicate column name error when creating a table from a select

I have a join on 3 tables with a large number of fields. After the join I keep getting an error due to duplicate column names. Using aliases has not solved the problem for me. Is there a way of dropping the ...
0
votes
2answers
35 views

Fetch products with same price for last x days

I would like to fetch those products that are reporting the same price for the last 5 days consecutively in MYSQL. I have attached sample data for your reference below. PID Price Date P1 10 25-...
1
vote
0answers
43 views

Redshift Exceptions on Insert

Using redshift jdbc driver: 1.2.16.1027 I routinely get exceptions such as: java.sql.SQLNonTransientException: [Amazon][JDBC](10900) Not all parameters have been populated. at com.amazon....
0
votes
0answers
19 views

How do I COPY a nested Avro field to Redshift as a single field?

I have the following Avro schema for a record, and I'd like to issue a COPY to Redshift: "fields": [{ "name": "id", "type": "long" }, { "name": "date", "type": { "type": "...
0
votes
1answer
41 views

Large CSV file with Tableau Desktop

I've a 100GB CSV file (200 million rows X 60 columns) which I'm using to create dashboards in Tableau Desktop via extract. I've been facing a performance issue and it takes about 2 minutes to refresh ...
-3
votes
0answers
25 views

[JDBC Driver]SAML error: sun.security.validator.ValidatorException:

I am getting the error below when trying to access a Redshift cluster through SQL Workbench/J: [JDBC Driver]SAML error: sun.security.validator.ValidatorException: PKIX path building failed: sun.security....
0
votes
3answers
42 views

Count distinct multiple columns in redshift

I am trying to count rows which have a distinct combination of 2 columns in Amazon redshift. The query I am using is - select count(distinct col1, col2) from schemaname.tablename where some filters ...
-2
votes
0answers
13 views

Rerun aborted COPY commands in Redshift

I'm trying to find an easy way to rerun some aborted COPY queries on Redshift, but I didn't find any straightforward way to do that. Can you please help by describing how to do it? Query execution ...
1
vote
2answers
67 views

Generate time series with date and hour and create table in Amazon Redshift

I'd like to generate a table of dates and hours in Amazon Redshift. The following query would work in Postgresql 9.1 and above but unfortunately we're limited to Amazon Redshift, which resulted in ...
0
votes
2answers
24 views

Add a column to populate rank for every group

I have history data with account details, where the account activity status is either 'Active' or 'Cancelled'. When the account is re-opened the account status becomes 'Active' and later can become '...
0
votes
2answers
44 views

invalid input syntax for type numeric: “ ”

I'm getting this message in Redshift: invalid input syntax for type numeric: " " , even after trying to implement the advice found in SO. I am trying to convert text to number. In my inner join, I ...
0
votes
1answer
16 views

pivot rows to columns in redshift dynamically

I have a Redshift table with different cases, company and date (timestamp). I used the following query to aggregate the number of cases per company by month: SELECT DATE_TRUNC('MONTH', case_date)...
0
votes
1answer
23 views

Redshift - Using output from a nested query as input to another query

I have a nested query that goes like with q1 as (select * from table1), q2 as (select * from table2) select q1.col1, q2.col2,listagg(q1.id || ' - ' || q2.id, ', ') as unique_id from q1 join q2 ...
0
votes
1answer
42 views

How to get rows within specific sequence of value?

I need to get sessions where I have one sequence of values in a specific order. Now I have this query which returns sessions for each user from raw data: select user_id, page, happened_at from db as u1 ...
0
votes
1answer
42 views

How to get user sessions?

I'm working with Redash and I need to get user rows where, for each row, the delta between date fields is less than an hour. In more detail: I need a session, user activity where it has some actions where ...
-2
votes
1answer
32 views

AWS : Redshift vs traditional dbms [closed]

In a traditional DBMS, when we try to execute an insert into a table and a delete from the same table (deleting just the data, not the table), I remember it will result in a deadlock. With Redshift when I delete ...
2
votes
1answer
19 views

Redshift COPY command raises error if S3 prefix does not exist

When I run this COPY command: COPY to_my_table (field1, field2, etc) FROM s3://my-service-f55b83j5vvkp/2018/09/03 CREDENTIALS 'aws_iam_role=...' JSON 'auto' TIMEFORMAT 'auto'; I get this error: The ...
0
votes
0answers
14 views

create table fails to see schema created in same execute() call

I'm attempting to execute the following against a Redshift database: create schema foo; create table foo.bar(baz integer); ...using cursor.execute(). The create table statement fails, complaining ...
0
votes
1answer
27 views

AWS DMS and Redshift

I am using DMS to migrate data from MySQL to Redshift. Inside DMS, I use the 'full-load-and-ongoing' option to load data to Redshift. Assuming the full-load is complete and the on-going is in progress,...
0
votes
1answer
29 views

Syntax error near “from” in amazon redshift query

I am working on queries in Amazon Redshift. I have created a query in which I am getting a syntax error, but I am unable to identify the issue behind it. The query looks OK to me. Below is the query: ...
0
votes
1answer
45 views

LAG SQL operation not working in Amazon Redshift

I keep getting the error: Syntax Error: at or near "," Line: 4 Position: 3 on Periscope when trying to create tracking for sessions from Amazon Redshift. The LAG operation seems to throw the error ...
-1
votes
1answer
16 views

Redshift WLM - how are unused slots used

My understanding of WLM in Redshift is that there are queues and each queue has slots (by default 5). And let's say the system is idle, and I run a query. By default it occupies one slot and runs the ...
3
votes
1answer
32 views

WHERE IN with multiple columns in Redshift

I am using Amazon Redshift where I have two tables: a staging table where I COPY all data from S3 and a target table where everything should eventually be inserted. Now I have a query that should ...
0
votes
2answers
38 views

What is the max size for a Redshift insert query?

I am trying to batch multiple rows of data into a Redshift INSERT query. In order to keep it efficient, I want to know the largest length I can go before I need to start a new batch. If there is a ...
1
vote
0answers
20 views

Load French characters in Amazon Redshift from S3

I have a .csv file which contains French characters. The data is like this: Antonie Bégarder,12345,6789,France The file was loaded to S3. I am using the COPY command to load the file from S3 to Redshift. ...
1
vote
3answers
28 views

Alternatives of array_agg() or string_agg() on redshift

I am using this query to get the aggregated results: select _bs, string_agg(_wbns, ',') from bag group by 1; I am getting this error: Error running query: function string_agg(character varying, "...
0
votes
1answer
22 views

Invalid operation: function pg_catalog.pgdate_part(“unknown”, text) does not exist;

I have a SQL script for Redshift. Here is part of the code: SELECT clo.name AS SalesManager, prospect.id AS ProspectId, prospect.fullname AS ProspectName, prospect.company, Prospect.Email, ...
1
vote
0answers
38 views

Redshift external table from parquet file showing null values for string data type

I am creating an external table in Redshift, pointing at a Parquet file stored in S3. The parquet file is created with pyarrow. When I SELECT * the external table defined below the "timestamp" works ...
0
votes
3answers
43 views

AWS Glue to Redshift: duplicate data?

Here are some bullet points on how I have things set up: I have CSV files uploaded to S3 and a Glue crawler set up to create the table and schema. I have a Glue job set up that writes the data ...