Problem:
When you specify a text file as the CHARTrunner data source, the data from the text file will not be read correctly if the format of the text file is “nonstandard”. In CHARTrunner, the symptom of this condition is incorrect column information shown on the “Data definition” tab of the chart definition.
By default, the Jet database engine (which is used for reading text files) expects text files to specify field names on the first row, and subsequent rows to contain data in “Comma separated values (CSV)” format. For example, CHARTrunner is able to properly read the following MyCsvData.txt file:
Filename = C:\MyTextData\MyCsvData.txt
Date,Value1,Value2,Value3
1/1/2000,11.6,12.8,12.4
1/2/2000,12.3,12.1,11.9
1/3/2000,15.6,12.3,12.4
However, CHARTrunner is unable to properly read the following MyTabData.txt file because it uses the Tab character (shown as <Tab>) as the field delimiter:
Filename = C:\MyTextData\MyTabData.txt
Date<Tab>Value1<Tab>Value2<Tab>Value3
1/1/2000<Tab>11.6<Tab>12.8<Tab>12.4
1/2/2000<Tab>12.3<Tab>12.1<Tab>11.9
1/3/2000<Tab>15.6<Tab>12.3<Tab>12.4
Also, CHARTrunner is unable to properly read the following NoHeaderRow.txt file because the first row does not contain the field names:
Filename = C:\MyTextData\NoHeaderRow.txt
1/1/2000,11.6,12.8,12.4
1/2/2000,12.3,12.1,11.9
1/3/2000,15.6,12.3,12.4
Solution:
The solution is to create a text file (using NotePad or another text editor) named Schema.ini in the same folder as the text file(s) you are using as a data source. The
Schema.ini file provides the Jet database engine with information about
how to properly interpret the contents of a text file.
For example, CHARTrunner is able to properly
interpret the contents of MyTabData.txt and NoHeaderRow.txt by creating
the following C:\MyTextData\Schema.ini file as follows:
[MyTabData.txt]
ColNameHeader=True
Format=TabDelimited
[NoHeaderRow.txt]
ColNameHeader=False
Format=CSVDelimited
Col1=Date DateTime
Col2=Value1 Single
Col3=Value2 Single
Col4=Value3 Single
Note that since MyCsvData.txt is already in a standard format it is not necessary to add an entry to Schema.ini for that file.
Format of the Schema.ini File
The
following information about the Schema.ini file was taken from
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odbc/htm/odbcjetsdk_98.asp.
The schema information file, which is always named Schema.ini
and always kept in the same directory as the text data source, provides
the Jet database engine with information about the general format of
the text file, the column name and data type information, and a number
of other data characteristics. A Schema.ini file is always required for
accessing fixed-length data; you should use a Schema.ini file when your
text table contains DateTime, Currency, or Decimal data or any time you
want more control over the handling of the data in the table.
Understanding Schema.ini Files
Schema.ini files provide schema information about the records in a
text file. Each Schema.ini entry specifies one of five characteristics
of the table:
· The text file name
· The file format
· The field names, widths, and types
· The character set
· Special data type conversions
· The following sections discuss these characteristics.
· Specifying the File Name
The first entry in Schema.ini is always the name of the text source
file enclosed in square brackets. The following example illustrates the
entry for the file MyTabData.txt:
[MyTabData.txt]
Specifying the File Format
The Format option in Schema.ini specifies the format of the text file. The Jet database engine can read the format automatically from most character-delimited files. You can use any single character as a delimiter in the file except the double quotation mark ("). The following table lists the valid values for the Format option.
|
Format specifier |
Table format |
Schema.ini Format statement |
|
Tab Delimited |
Fields in the file are delimited by tabs. |
Format=TabDelimited |
|
CSV Delimited |
Fields in the file are delimited by commas (comma-separated values). |
Format=CSVDelimited |
|
Custom Delimited |
Fields
in the file are delimited by any character you choose to input into the
dialog box. All except the double quote (") are allowed, including
blank. |
Format=Delimited(custom character) -or- With no delimiter specified: Format=Delimited( ) |
|
Fixed Length |
Fields in the file are of a fixed length. |
Format=FixedLength |
Specifying the Fields
Specify field names in a character-delimited text file in two ways:
1. Include the field names in the first row of the table and set ColNameHeader to True.
2. Specify each column by number and designate the column name and data type.
You must specify each column by number and designate the column name, data type, and width for fixed-length files.
The data types of the fields can also be determined. Use the MaxScanRows option to indicate how many rows should be scanned when determining the column types. If you set MaxScanRows to 0, the entire file is scanned.
The following entry indicates that Jet database engine should use
the data in the first row of the table to determine field names and
should examine the entire file to determine the data types used:
ColNameHeader=True
MaxScanRows=0
The next entry designates fields in a table by using the column number (Coln)
option, which is optional for character-delimited files and required
for fixed-length files. The example shows the Schema.ini entries for
two fields, a 10-character CustomerNumber text field and a 30-character
CustomerName text field:
Col1=CustomerNumber Text Width 10
Col2=CustomerName Text Width 30
The syntax of Coln is:
Coln=ColumnName type [Width #]
The following table describes each part of the Coln entry.
|
Parameter |
Description |
|
ColumnName |
The text name of the column. If the column name contains embedded spaces, you must enclose it in double quotation marks. |
|
type |
Data types are: Byte |
|
Width |
The literal string value Width. Indicates that the following number designates the width of the column (optional for character-delimited files; required for fixed-length files). |
|
# |
The integer value that designates the width of the column (required if Width is specified). |
Selecting a Character Set
You can select from two character sets: ANSI and OEM. The following
example shows the Schema.ini entry that sets the character set to ANSI:
CharacterSet=ANSI
Specifying Data Type Formats and Conversions
The Schema.ini file contains a number
of options that you can use to specify how data is converted or
displayed. The following table lists each of these options.
|
Option |
Description |
|
DateTimeFormat |
Can be set to a format string indicating dates and times. You should specify this entry if all date/time fields in the import/export are handled with the same format. All Microsoft Jet formats except A.M. and P.M. are supported. In the absence of a format string, the Windows Control Panel short date picture and time options are used. |
|
DecimalSymbol |
Can be set to any single character that is used to separate the integer from the fractional part of a number. |
|
NumberDigits |
Indicates the number of decimal digits in the fractional portion of a number. |
|
NumberLeadingZeros |
Specifies
whether a decimal value less than 1 and greater than –1 should contain
leading zeros; this value can either be False (no leading zeros) or
True. |
|
CurrencySymbol |
Indicates the currency symbol to be used for currency values in the text file. Examples include the dollar sign ($) and Dm. |
|
CurrencyPosFormat |
Can be set to any of the following values: · Currency symbol prefix with no separation ($1) · Currency symbol suffix with no separation (1$) · Currency symbol prefix with one character separation ($ 1) · Currency symbol suffix with one character separation (1 $) |
|
CurrencyDigits |
Specifies the number of digits used for the fractional part of a currency amount. |
|
CurrencyNegFormat |
Can be one of the following values: · ($1) · –$1 · $–1 · $1– · (1$) · –1$ · 1–$ · 1$– · –1 $ · –$ 1 · 1 $– · $ 1– · $ –1 · 1– $ · ($ 1) · (1 $) This example shows the dollar sign, but you should replace it with the appropriate CurrencySymbol value in the actual program. |
|
CurrencyThousandSymbol |
Indicates the single-character symbol to be used for separating currency values in the text file by thousands. |
|
CurrencyDecimalSymbol |
Can be set to any single character that is used to separate the whole from the fractional part of a currency amount. |