Reading File using Sequential File Stage

April 26, 2011, ukatru

Sequential File:

The Sequential File stage is a file stage. It allows you to read data from or write data to one or more flat files, as shown in the figure below:

The stage executes in parallel mode by default if reading multiple files but executes sequentially if it is only reading one file.

In order to read a sequential file, DataStage needs to know the format of the file.

If you are reading a delimited file, you need to specify the delimiter in the Format tab.
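To make the difference between the two formats concrete, here is a minimal Python sketch (run outside DataStage) of what the format information is used for: a delimited record is split on the delimiter, while a fixed-width record is cut at column widths. The sample data, the `|` delimiter, and the three-column layout are all made up for illustration.

```python
import csv
import io

# Hypothetical sample data, not taken from any real job.
DELIMITED = "1001|Smith|NY\n1002|Jones|CA\n"
FIXED = ("1001" + "Smith".ljust(10) + "NY\n"
         + "1002" + "Jones".ljust(10) + "CA\n")

# Hypothetical fixed-width layout: (column name, width in characters).
LAYOUT = [("id", 4), ("name", 10), ("state", 2)]


def read_delimited(text, delimiter="|"):
    """Split each line on the delimiter; the delimiter is the format hint."""
    return [row for row in csv.reader(io.StringIO(text), delimiter=delimiter)]


def read_fixed(text, layout):
    """Cut each line at the column widths; no delimiter is involved."""
    rows = []
    for line in text.splitlines():
        row, pos = [], 0
        for _name, width in layout:
            row.append(line[pos:pos + width])
            pos += width
        rows.append(row)
    return rows
```

Note that the fixed-width reader keeps the trailing pad characters in the `name` column, which is why the padding character matters when writing such files.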

Reading a Fixed-Width File:

Double-click the Sequential File stage and go to the Properties tab.

Source:

File: Give the file name, including the path.

Read Method: Whether to specify filenames explicitly or use a file pattern.

Important Options:

First Line is Column Names: If set to True, the first line of a file contains column names on writing and is ignored on reading.

Keep File Partitions: Set to True to partition the read data set according to the organization of the input file(s).

Reject Mode: Continue to simply discard any rejected rows; Fail to stop if any row is rejected; Output to send rejected rows down a reject link.
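The three Reject Mode settings can be pictured with a small Python sketch. This is only a simplified model, not how the stage is implemented: the function name `read_rows` is hypothetical, and a row is treated as "rejected" simply when it has the wrong number of fields.

```python
def read_rows(lines, expected_fields, reject_mode="Continue", delimiter=","):
    """Return (accepted, rejected) rows, mimicking the three Reject Mode options.

    Simplified assumption: a row is rejected when its field count is wrong.
    """
    if reject_mode not in ("Continue", "Fail", "Output"):
        raise ValueError(f"unknown reject mode: {reject_mode!r}")

    accepted, rejected = [], []
    for number, line in enumerate(lines, start=1):
        fields = line.split(delimiter)
        if len(fields) == expected_fields:
            accepted.append(fields)
        elif reject_mode == "Fail":
            # Stop the whole read as soon as one row is rejected.
            raise RuntimeError(f"row {number} does not match the format: {line!r}")
        elif reject_mode == "Output":
            # Send the raw row down the "reject link".
            rejected.append(line)
        # reject_mode == "Continue": the row is silently discarded.
    return accepted, rejected
```

With `Continue` the bad row just disappears, which is why it is easy to lose data without noticing.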

For fixed-width files, however, you can configure the stage to behave differently:
* You can specify that single files can be read by multiple nodes. This can improve performance on cluster systems.
* You can specify that a number of readers run on a single node. This means, for example, that a single file can be partitioned as it is read.

These two options are mutually exclusive.

Scenario 1:

Reading file sequentially.

Scenario 2:

Read From Multiple Nodes = Yes

Once we add Read From Multiple Nodes = Yes, the stage executes in parallel mode by default.

If you run the job with the above configuration, it will abort with the following fatal error.

sff_SourceFile: The multinode option requires fixed length records.

(That means you can use this option to read fixed-width files only.)
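One way to see why fixed-length records are required: when every record has the same length, each reader's start and end offsets can be computed by simple arithmetic, without scanning the file for record boundaries. The Python sketch below illustrates that idea only; it is not DataStage's internal algorithm, and the function name `reader_byte_ranges` is hypothetical. It assumes `record_length` includes any record delimiter bytes.

```python
def reader_byte_ranges(file_size, record_length, readers):
    """Split a file of fixed-length records into one (start, end) byte range per reader.

    Every range starts and ends on a record boundary, so no reader
    ever sees half a record.
    """
    if file_size % record_length != 0:
        raise ValueError("file size is not a whole number of records")

    total_records = file_size // record_length
    base, extra = divmod(total_records, readers)

    ranges, start_record = [], 0
    for i in range(readers):
        # The first `extra` readers take one additional record each.
        count = base + (1 if i < extra else 0)
        start = start_record * record_length
        end = (start_record + count) * record_length
        ranges.append((start, end))
        start_record += count
    return ranges


# 10 records of 10 bytes shared by 4 readers.
print(reader_byte_ranges(100, 10, 4))
# prints [(0, 30), (30, 60), (60, 80), (80, 100)]
```

With variable-length (delimited) records there is no such formula, because a record boundary can fall anywhere.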

In order to fix the above issue, go to the Format tab and add additional parameters as shown below.

The job now finishes successfully. Please see the DataStage monitor below for the performance improvement compared with reading from a single node.

Scenario 3:

Read a delimited file by adding the Number of Readers Per Node option instead of the multinode option to improve read performance. Once we add this option, the Sequential File stage executes in parallel mode by default.

If we are reading from and writing to fixed-width files, it is always good practice to add the APT_STRING_PADCHAR DataStage environment variable and assign 0x20 as its default value so that strings are padded with spaces. Otherwise DataStage pads with the null character (the DataStage default padding character).
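The difference between the two padding characters is easy to see in a few lines of Python. This is just an illustration of space (0x20) versus null (0x00) padding, not DataStage code, and `pad_field` is a hypothetical helper.

```python
def pad_field(value, width, padchar):
    """Right-pad a string to a fixed width with the given pad character."""
    return value.ljust(width, padchar)


spaces = pad_field("abc", 6, "\x20")   # pad with spaces (0x20)
nulls = pad_field("abc", 6, "\x00")    # pad with null characters (0x00)

print(spaces.encode("ascii").hex())  # prints 616263202020
print(nulls.encode("ascii").hex())   # prints 616263000000
```

Both values have the same length, but the null-padded one contains non-printable bytes that many downstream tools and editors handle badly.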

Always keep Reject Mode = Fail to make sure the DataStage job fails if we get a wrong format from the source systems.

Comments (6)

sandeep

May 9, 2011 at 9:20 am

For the fixed-width format, should we specify the delimiter type? Can you please mention what the delimiter would be?
- ukatru
  
  May 9, 2011 at 8:19 pm
  
  We don’t need to specify a delimiter to read a fixed-width file.
  
  Thanks
  Uma
Catherine

October 6, 2011 at 8:43 pm

Hello,
Could you please help.
I have this input file which is fixed width.
000001 Baggins Bilbo 20090811
000002 Baggins Frodo 20090801
000003 Gamgee Samwise 20090820
000004 Cotton Rosie 20090821

I am configuring the data type as VarChar with length 6, 22, 10 and 8.
However, when I view data/output data from this input file, only the first column is being recognized.
Could you please share what I could have done wrong?

Thank you very much.
- ukatru
  
  November 21, 2011 at 8:56 pm
  
  If you are reading a fixed-width file, please use Char as the data type; then it should work.
David

August 14, 2012 at 11:11 am

Hi Uma,
Could you do a post showing an example of using CoSort’s external SortCL (flat-file transform) utility called as sequential file stage? We’d like to offer a web and blog link about it since we’re frequently asked for an illustrated example. See Solution 2) at http://www.iri.com/solutions/ETL_DB_Acceleration/DataStage where this is mentioned, and let me know what you’d need to illustrate it on your side please. Thanks a lot,
David
- ukatru
  
  September 1, 2012 at 6:36 pm
  
  Hi, I don’t have the CoSort software installed, and there is no trial version available to show an example.
  
  Thanks
  Uma
