PQ Systems Knowledge Base

CR: Using a Text File as a Data Source

Problem:

When you specify a text file as the CHARTrunner data source, the data from the text file will not be read correctly if the format of the text file is “nonstandard”.  In CHARTrunner, the symptom of this condition is incorrect column information shown on the “Data definition” tab of the chart definition.

By default, the Jet database engine (which is used for reading text files) expects text files to specify field names on the first row, and subsequent rows to contain data in “Comma separated values (CSV)” format.  For example, CHARTrunner is able to properly read the following MyCsvData.txt file:

Filename = C:\MyTextData\MyCsvData.txt

Date,Value1,Value2,Value3

1/1/2000,11.6,12.8,12.4

1/2/2000,12.3,12.1,11.9

1/3/2000,15.6,12.3,12.4

However, CHARTrunner is unable to properly read the following MyTabData.txt file because it uses the Tab character (shown as <Tab>) as the field delimiter:

Filename = C:\MyTextData\MyTabData.txt

Date<Tab>Value1<Tab>Value2<Tab>Value3

1/1/2000<Tab>11.6<Tab>12.8<Tab>12.4

1/2/2000<Tab>12.3<Tab>12.1<Tab>11.9

1/3/2000<Tab>15.6<Tab>12.3<Tab>12.4

Also, CHARTrunner is unable to properly read the following NoHeaderRow.txt file because the first row does not contain the field names:

Filename = C:\MyTextData\NoHeaderRow.txt

1/1/2000,11.6,12.8,12.4

1/2/2000,12.3,12.1,11.9

1/3/2000,15.6,12.3,12.4

Solution:

The solution is to create a text file (using NotePad or another text editor) named Schema.ini in the same folder as the text file(s) you are using as a data source.  The Schema.ini file provides the Jet database engine with information about how to properly interpret the contents of a text file. 

For example, CHARTrunner is able to properly interpret the contents of MyTabData.txt and NoHeaderRow.txt by creating the following C:\MyTextData\Schema.ini file as follows:

[MyTabData.txt]

ColNameHeader=True

Format=TabDelimited

 

[NoHeaderRow.txt]

ColNameHeader=False

Format=CSVDelimited

Col1=Date DateTime

Col2=Value1 Single

Col3=Value2 Single

Col4=Value3 Single

Note that since MyCsvData.txt is already in a standard format it is not necessary to add an entry to Schema.ini for that file.

Format of the Schema.ini File

The following information about the Schema.ini file was taken from http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odbc/htm/odbcjetsdk_98.asp.

The schema information file, which is always named Schema.ini and always kept in the same directory as the text data source, provides the Jet database engine with information about the general format of the text file, the column name and data type information, and a number of other data characteristics. A Schema.ini file is always required for accessing fixed-length data; you should use a Schema.ini file when your text table contains DateTime, Currency, or Decimal data or any time you want more control over the handling of the data in the table.

Understanding Schema.ini Files

Schema.ini files provide schema information about the records in a text file. Each Schema.ini entry specifies one of five characteristics of the table:

·        The text file name

·        The file format

·        The field names, widths, and types

·        The character set

·        Special data type conversions

·        The following sections discuss these characteristics.

·        Specifying the File Name

The first entry in Schema.ini is always the name of the text source file enclosed in square brackets. The following example illustrates the entry for the file MyTabData.txt:

[MyTabData.txt]

Specifying the File Format

The Format option in Schema.ini specifies the format of the text file. The Jet database engine can read the format automatically from most character-delimited files. You can use any single character as a delimiter in the file except the double quotation mark (").  The following table lists the valid values for the Format option.

Format specifier

Table format

Schema.ini Format statement

Tab Delimited

Fields in the file are delimited by tabs.

Format=TabDelimited

CSV Delimited

Fields in the file are delimited by commas (comma-separated values).

Format=CSVDelimited

Custom Delimited

Fields in the file are delimited by any character you choose to input into the dialog box. All except the double quote (") are allowed, including blank.

Format=Delimited(custom character)

-or-

With no delimiter specified:

Format=Delimited( )

Fixed Length

Fields in the file are of a fixed length.

Format=FixedLength

Specifying the Fields

Specify field names in a character-delimited text file in two ways:

1.      Include the field names in the first row of the table and set ColNameHeader to True.

2.      Specify each column by number and designate the column name and data type.

You must specify each column by number and designate the column name, data type, and width for fixed-length files.

The data types of the fields can also be determined. Use the MaxScanRows option to indicate how many rows should be scanned when determining the column types. If you set MaxScanRows to 0, the entire file is scanned.

The following entry indicates that Jet database engine should use the data in the first row of the table to determine field names and should examine the entire file to determine the data types used:

ColNameHeader=True

MaxScanRows=0

The next entry designates fields in a table by using the column number (Coln) option, which is optional for character-delimited files and required for fixed-length files. The example shows the Schema.ini entries for two fields, a 10-character CustomerNumber text field and a 30-character CustomerName text field:

Col1=CustomerNumber Text Width 10

Col2=CustomerName Text Width 30

The syntax of Coln is:

Coln=ColumnName type [Width #]

The following table describes each part of the Coln entry.

Parameter

Description

ColumnName

The text name of the column. If the column name contains embedded spaces, you must enclose it in double quotation marks.

 

type

Data types are:

Byte
Short
Long
Currency
Single
Double
DateTime
Text

Width

The literal string value Width. Indicates that the following number designates the width of the column (optional for character-delimited files; required for fixed-length files).

 

#

The integer value that designates the width of the column (required if Width is specified).

Selecting a Character Set

You can select from two character sets: ANSI and OEM. The following example shows the Schema.ini entry that sets the character set to ANSI:

CharacterSet=ANSI

Specifying Data Type Formats and Conversions

The Schema.ini file contains a number of options that you can use to specify how data is converted or displayed. The following table lists each of these options.

Option

Description

DateTimeFormat

Can be set to a format string indicating dates and times. You should specify this entry if all date/time fields in the import/export are handled with the same format. All Microsoft Jet formats except A.M. and P.M. are supported. In the absence of a format string, the Windows Control Panel short date picture and time options are used.

 

DecimalSymbol

Can be set to any single character that is used to separate the integer from the fractional part of a number.

 

NumberDigits

Indicates the number of decimal digits in the fractional portion of a number.

 

NumberLeadingZeros

Specifies whether a decimal value less than 1 and greater than –1 should contain leading zeros; this value can either be False (no leading zeros) or True.

CurrencySymbol

Indicates the currency symbol to be used for currency values in the text file. Examples include the dollar sign ($) and Dm.

 

CurrencyPosFormat

Can be set to any of the following values:

·                     Currency symbol prefix with no separation ($1)

·                     Currency symbol suffix with no separation (1$)

·                     Currency symbol prefix with one character separation ($ 1)

·                     Currency symbol suffix with one character separation (1 $)

CurrencyDigits

Specifies the number of digits used for the fractional part of a currency amount.

 

CurrencyNegFormat

Can be one of the following values:

·                     ($1)

·                     –$1

·                     $–1

·                     $1–

·                     (1$)

·                     –1$

·                     1–$

·                     1$–

·                     –1 $

·                     –$ 1

·                     1 $–

·                     $ 1–

·                     $ –1

·                     1– $

·                     ($ 1)

·                     (1 $)

This example shows the dollar sign, but you should replace it with the appropriate CurrencySymbol value in the actual program.

CurrencyThousandSymbol

Indicates the single-character symbol to be used for separating currency values in the text file by thousands.

 

CurrencyDecimalSymbol

Can be set to any single character that is used to separate the whole from the fractional part of a currency amount.

 

Would you like to...

Print this page Print this page

Email this page Email this page

Post a comment Post a comment

Subscribe me

Add to favorites Add to favorites

Remove Highlighting Remove Highlighting

Edit this Article

Quick Edit

Export to PDF

User Opinions (1 vote)

100% thumbs up 0% thumbs down

How would you rate this answer?



Thank you for rating this answer.

Related Articles

No related articles were found.

Attachments

No attachments were found.

Continue