
Fielded Text:

Introduction

"Fielded Text" trademark owned by Paul Klink

What is the Fielded Text standard all about?

The ultimate aim of the Fielded Text Standard is to make it a lot easier to use text files to store or distribute tables of values.

Currently files such as Comma Separated Value (CSV) text files are often used to transfer this type of data.  Their advantage is that they are easy to understand, can be viewed with a plain text editor, and programmers can access the data without any special run-times.  Their disadvantage is that everyone uses different structures and formats for storing their data in text files.  Hence programmers invariably have to write a new parser whenever they want to extract data from a text file from a new supplier.

Fielded Text takes a new approach when working with text files containing tables of values!

Fielded Text allows you to associate a Meta File with a text file.  This Meta file is a small XML file which describes the structure and format of data within the text file.  More importantly, it allows you to access the data in the file in the same way you access records in a database table.  It combines the simplicity of text files with the convenience of database access.

A full description of how the Fielded Text standard enables this can be found in the Standard page on this website.  However, if you are looking for a concise description, read on.

Capabilities

The Fielded Text standard aims to be compatible with all text files today that contain tables of values.  In support of this aim, it has a set of capabilities which should cover nearly all structures and formats used in existing text files.  These capabilities are summarised here.

Creating Meta Files

There are several ways of creating a Meta file for a Fielded Text file:

  1. Use a text editor.  Meta files are XML files so, provided you know the tags and schema, you can use any text editor (or XML file editor) to create one.
  2. Use a Fielded Text Editor.  This is the easy way.  A Fielded Text Editor is a specialised text editor for working with Fielded Text files.  With a Fielded Text Editor you can load a sample text file and then visually set the Meta properties for it.  The editor will interactively show you any parsing errors arising from incorrect properties.  Once the properties are correctly set, you can export them into a Meta File.

    A list of Fielded Text editors is here - including at least one free one.
  3. Programmatically.  Fielded Text software components can also be used to generate Meta files programmatically.  This method will be used in specialised Fielded Text applications (for example, Fielded Text Editors).

Meta files typically have a file extension of "ftm".

Declared and Undeclared Fielded Text Files

Fielded Text files can be either declared or not declared (undeclared).

A declared Fielded Text file has 2 special lines at the start of the file.  These 2 lines are called the file declaration.  The declaration contains a marker which identifies the file as a Fielded Text file and it specifies what version of the Fielded Text Standard the file conforms to.

In addition, the declaration can also specify the Meta file which this text file is associated with.  It can specify the Meta with:

Declared Fielded Text files remove the need for end users to match a text file with the Meta.  This makes them more reliable as users are less likely to parse or interpret the data incorrectly.  For example, organisations publishing data can place a copy of the Meta on their website and have the downloadable data text files reference it.  End users can then simply download the text file and the parser will automatically know how to obtain the correct Meta.

Nearly all existing text files containing tables of values will be 'Undeclared' Fielded Text files.  They can be handled in exactly the same way as Declared Fielded Text files; however, the text file will need to be explicitly associated with its Meta.

Basic Example

Main Section properties/attributes (the attributes of the <FieldedText> element, described under "Fielded Text Meta file Structure"):

| Attribute | Description | Default |
| --- | --- | --- |
| Culture | Specifies which regional conventions should be used. (RFC 4646) | Invariant |
| EndOfLineType | Method used to detect line ends in text file | Auto |
| EndOfLineChar | Character which denotes line end when EndOfLineType is "Char" | ; |
| EndOfLineAutoWriteType | Method used to write line ends when EndOfLineType = "Auto" | Local |
| LastLineEndedType | Whether last line is terminated with End of Line character(s) | Optional |
| EndOfLineIsSeparator | End Of Line separates records instead of terminating records (deprecated in Version 1.1.  Use LastLineEndedType instead) | False |
| QuoteChar | Character used to quote a field value (ie enclose field value) | " |
| DelimiterChar | Character which separates fields in a line | , |
| LineCommentChar | Character which, if it's first in line, denotes that line is a comment | 0x04 |
| StuffedEmbeddedQuotes | Quotes can be embedded in a quoted field by having 2 in a row | True |
| SubstitutionEnabled | Enables substitutions in the text | False |
| SubstitutionChar | Character which identifies a substitution | \ |
| AllowEndOfLineCharInQuotes | Allow End of Line character(s) within a quoted string | True |
| IgnoreBlankLines | Ignore blank lines in the text | True |
| IgnoreExtraChars | Ignore any characters in a line after all fields have been parsed | True |
| AllowIncompleteRecords | Lines do not need to contain all fields expected by a record | False |
| HeadingLineCount | Number of heading lines | 0 |
| MainHeadingLineIndex | Index of Main Heading line | 0 |
| HeadingConstraint | Default Constraints applied to field headings | None |
| HeadingQuotedType | Default specification for how field heading values are quoted | Optional |
| HeadingAlwaysWriteOptionalQuote | Default specifier for whether field heading optional quotes should be written | True |
| HeadingWritePrefixSpace | Default specifier for whether field headings should be prefixed with a space when written | False |
| HeadingPadAlignment | Default alignment of padding for fixed width field headings | Auto |
| HeadingPadCharType | Default method used to pad fixed width field headings | EndOfValue |
| HeadingPadChar | Default character used to pad fixed width field headings | <space> |
| HeadingTruncateType | Default method used to truncate fixed width field headings | Right |
| HeadingTruncateChar | Default character used to fill truncated field headings if HeadingTruncateType = TruncateChar | # |
| HeadingEndOfValueChar | Default character used to flag End of Field Heading when HeadingPadCharType = EndOfValue | 0x03 |

Below is a basic CSV file. It has 2 heading lines and 4 data lines. The lines contain 7 fields of various types.

"Pet Name", "Age", "Color", "Date Received", "Price", "Needs Walking", "Type"
, (Years), , , (Dollars), ,
"Rover", 4.5, Brown, 12 Feb 2004, 80, True, "Dog"
"Charlie", , Gold, 5 Apr 2007, 12.3, False, "Fish"
"Molly", 2, Black, 12 Dec 2006, 25, False, "Cat"
"Gilly", , White, 10 Apr 2007, 10, False, "Guinea Pig"

The following Fielded Text Meta file specifies the structure and layout (schema) of the above text file.

<?xml version="1.0" encoding="utf-16"?>
<FieldedText HeadingLineCount="2">
  <Field Name="PetName" />
  <Field DataType="Float" Name="Age" />
  <Field Name="Color" />
  <Field DataType="DateTime" Name="DateReceived" Format="d MMM yyyy" />
  <Field DataType="Decimal" Name="Price" />
  <Field DataType="Boolean" Name="NeedsWalking" />
  <Field Name="Type" />
</FieldedText>
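To make the link between the Meta and the text concrete, here is a minimal Python sketch that reads the example CSV using the example Meta.  It is not a Fielded Text parser and uses no Fielded Text software component: it relies only on Python's standard csv and xml modules, and it honours only HeadingLineCount, Name and DataType.  The names read_records and CONVERTERS are hypothetical, the mapping of the Format "d MMM yyyy" to "%d %b %Y" is an assumption, and empty values are assumed to mean "no value".  The XML declaration line is left out of the embedded Meta string for simplicity.

```python
import csv
import io
import xml.etree.ElementTree as ET
from datetime import datetime
from decimal import Decimal

META = """<FieldedText HeadingLineCount="2">
  <Field Name="PetName" />
  <Field DataType="Float" Name="Age" />
  <Field Name="Color" />
  <Field DataType="DateTime" Name="DateReceived" Format="d MMM yyyy" />
  <Field DataType="Decimal" Name="Price" />
  <Field DataType="Boolean" Name="NeedsWalking" />
  <Field Name="Type" />
</FieldedText>"""

TEXT = '''"Pet Name", "Age", "Color", "Date Received", "Price", "Needs Walking", "Type"
, (Years), , , (Dollars), ,
"Rover", 4.5, Brown, 12 Feb 2004, 80, True, "Dog"
"Charlie", , Gold, 5 Apr 2007, 12.3, False, "Fish"
"Molly", 2, Black, 12 Dec 2006, 25, False, "Cat"
"Gilly", , White, 10 Apr 2007, 10, False, "Guinea Pig"
'''

# Hypothetical converters keyed by the DataType names used in the Meta.
CONVERTERS = {
    "String": str,
    "Float": float,
    "Decimal": Decimal,
    "Boolean": lambda s: s == "True",
    # Assumption: Format "d MMM yyyy" corresponds to strptime's "%d %b %Y".
    "DateTime": lambda s: datetime.strptime(s, "%d %b %Y"),
}


def read_records(meta_xml, text):
    """Return the data lines as a list of dicts keyed by field Name."""
    root = ET.fromstring(meta_xml)
    heading_lines = int(root.get("HeadingLineCount", "0"))
    # DataType defaults to String when the attribute is absent.
    fields = [(f.get("Name"), f.get("DataType", "String"))
              for f in root.findall("Field")]
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    records = []
    for line_no, row in enumerate(reader):
        if line_no < heading_lines or not row:
            continue  # skip heading lines and blank lines
        record = {}
        for (name, data_type), raw in zip(fields, row):
            raw = raw.strip()
            # Assumption: an empty value means the field has no value.
            record[name] = CONVERTERS[data_type](raw) if raw else None
        records.append(record)
    return records


records = read_records(META, TEXT)
for record in records:
    print(record["PetName"], record["Age"], record["DateReceived"])
```

Each record can then be accessed by field name, much as a row in a database table would be.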

Following is a Declared Fielded Text file which contains the above CSV text together with its Meta embedded as comments. The ~ character specifies a comment line.

~|!Fielded Text^| Version="1.0"
~ MetaEmbedded="True"
~ <?xml version="1.0" encoding="utf-16"?>
~ <FieldedText LineCommentChar="~" HeadingLineCount="2">
~ <Field Name="PetName" />
~ <Field DataType="Float" Name="Age" />
~ <Field Name="Color" />
~ <Field DataType="DateTime" Name="DateReceived" Format="d MMM yyyy" />
~ <Field DataType="Decimal" Name="Price" />
~ <Field DataType="Boolean" Name="NeedsWalking" />
~ <Field Name="Type" />
~ </FieldedText>
"Pet Name", "Age", "Color", "Date Received", "Price", "Needs Walking", "Type"
, (Years), , , (Dollars), ,
"Rover", 4.5, Brown, 12 Feb 2004, 80, True, "Dog"
"Charlie", , Gold, 5 Apr 2007, 12.3, False, "Fish"
"Molly", 2, Black, 12 Dec 2006, 25, False, "Cat"
"Gilly", , White, 10 Apr 2007, 10, False, "Guinea Pig"

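As a rough illustration of how a reader might separate the embedded Meta from the data in a declared file like the one above, here is a Python sketch.  The function name split_declared is hypothetical, and the rules it applies are assumptions rather than the standard's own: the declaration is taken to be the first two lines, the comment character is assumed to be known in advance, and every comment line directly after the declaration is assumed to belong to the embedded Meta.  The sample is a shortened two-field version of the file above.

```python
import xml.etree.ElementTree as ET

SAMPLE = '''~|!Fielded Text^| Version="1.0"
~ MetaEmbedded="True"
~ <?xml version="1.0" encoding="utf-16"?>
~ <FieldedText LineCommentChar="~" HeadingLineCount="2">
~ <Field Name="PetName" />
~ <Field DataType="Float" Name="Age" />
~ </FieldedText>
"Pet Name", "Age"
, (Years)
"Rover", 4.5
'''


def split_declared(text, comment_char="~"):
    """Return (meta_root, remaining_lines) for a file with embedded Meta."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    declaration, rest = lines[:2], lines[2:]
    # Assumption: the marker text and the MetaEmbedded flag appear literally.
    if "Fielded Text" not in declaration[0]:
        raise ValueError("not a declared Fielded Text file")
    if 'MetaEmbedded="True"' not in declaration[1]:
        raise ValueError("declaration does not embed its Meta")
    meta_lines = []
    # Collect the comment lines that directly follow the declaration.
    while rest and rest[0].startswith(comment_char):
        meta_lines.append(rest.pop(0)[len(comment_char):].strip())
    # Drop the XML declaration line; it is not needed to parse a str.
    meta_lines = [ln for ln in meta_lines if not ln.startswith("<?xml")]
    return ET.fromstring("\n".join(meta_lines)), rest


meta, remaining = split_declared(SAMPLE)
print(meta.get("HeadingLineCount"), [f.get("Name") for f in meta.findall("Field")])
print(remaining)
```

The remaining lines (heading lines followed by record lines) can then be parsed according to the extracted Meta.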
Fielded Text file Structure

A Fielded Text file consists of 2 main parts: Header and Body.  The Body contains the lines which hold the data (the records).  The Header consists of all the lines prior to the Body (including heading lines).

At a more detailed level, the header part of the file can be split into the following sections:

The body part of the text file begins either:

The record part can contain record lines and comment lines (and possibly ignored blank lines).  A record line contains the actual data, i.e. a row of values.  Each record line consists of a sequence of field values.  The format of these field values is specified by the Meta.  The Meta also specifies the structure of the record lines, including how field values are separated.

It is possible for a record to span multiple lines in the text file.  This will occur when a field value contains one or more "End of Line" characters.  If a record does span multiple lines, then no line within that record will be treated as a comment line or an ignored blank line.  Accordingly, it is possible for lines in the body part to begin with a line comment character but not be treated as a comment line.
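The multi-line rule can be sketched in Python as follows.  This is an illustration under simplifying assumptions, not the standard's algorithm: the function name logical_records is hypothetical, "~" is assumed to be the comment character, and a quoted value is considered open whenever an odd number of quote characters has been seen so far (which also copes with doubled, i.e. stuffed, quotes).

```python
import csv
import io


def logical_records(text, comment_char="~", quote_char='"'):
    """Yield one string per record, joining the physical lines of a record
    whose quoted value is still open at the end of a line."""
    pending = []       # physical lines of the record being assembled
    in_quotes = False  # True while a quoted value is still open
    for line in text.splitlines(keepends=True):
        if not in_quotes:
            if line.startswith(comment_char):
                continue  # a genuine comment line
            if line.strip() == "":
                continue  # an ignored blank line
        pending.append(line)
        # An odd number of quote characters toggles the open/closed state.
        if line.count(quote_char) % 2 == 1:
            in_quotes = not in_quotes
        if not in_quotes:
            yield "".join(pending)
            pending = []
    if pending:
        yield "".join(pending)  # unterminated record at end of text


sample = (
    'Name,Note\n'
    '"Rover","first line\n'
    '~ not a comment\n'
    '"\n'
    '~ a real comment\n'
    '"Molly","ok"\n'
)

records = []
for rec in logical_records(sample):
    records.extend(csv.reader(io.StringIO(rec, newline="")))
print(records)
```

Here the line "~ not a comment" is kept as part of Rover's Note value because it falls inside an open quoted value, while "~ a real comment" is skipped.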

Fielded Text Meta file Structure

The Meta contains the following groups of information:

Main Section which specifies properties applying to the whole text file.  In the above example Meta file, the attributes in the <FieldedText> element apply to the whole text file and make up the main section.  The Main Section can contain the properties/attributes listed earlier on this page (Culture through HeadingEndOfValueChar).

Field Sections which specify the properties of each field of data used within the text file.  In the above example Meta file, the attributes in a <Field> element apply to the respective field and make up a field section.  A Field Section can contain the following properties/attributes:

Fields can have a DataType of: String, Boolean, Integer, Float, Decimal (similar to Float but better suited for financial calculations) or DateTime.  Some of the attributes listed above are not applicable to all field DataTypes and some use different values in different DataTypes.

Substitution Sections specify which substitutions are used within the text file. Substitutions are similar to Escape Sequences used in some CSV files (eg \n).  A Substitution Section can contain the following properties/attributes:

Each Sequence has a series of <Item> elements which specify the fields included in the sequence.  An <Item> element can contain the following properties/attributes:


An <Item> element can also contain a series of <Redirect> elements.  The <Redirect> elements determine which sequence should be invoked if a field contains specified values.  A <Redirect> element can contain the following properties/attributes:

Field Section properties/attributes:

| Attribute | Description | Default |
| --- | --- | --- |
| DataType | Field Data Type | String |
| Index | Explicitly specifies position of field | |
| Id | Tag available for User Definition | 0 |
| Name | Field Name | <Blank> |
| FixedWidth | Specifies whether field has a fixed number of characters | False |
| Width | Number of characters in field if FixedWidth = True | 1 |
| HeadingConstraint | Constraints applied to headings | Main HeadingConstraint |
| Constant | Field is a constant | False |
| ValueQuotedType | Specification for how field values are quoted | Optional |
| ValueAlwaysWriteOptionalQuote | Specifier for whether a value's optional quotes should be written | False |
| ValueWritePrefixSpace | Specifier for whether values should be prefixed with a space when written | False |
| ValuePadAlignment | Alignment of padding for fixed width field values | Auto |
| ValuePadCharType | Method used to pad fixed width field values | EndOfValue |
| ValuePadChar | Character used to pad fixed width field values | Depends on DataType |
| ValueTruncateType | Method used to truncate fixed width field values | Exception |
| ValueTruncateChar | Character used to fill truncated field values if ValueTruncateType = TruncateChar | # |
| ValueEndOfValueChar | Character used to flag End of Field Value when ValuePadCharType = EndOfValue | 0x03 |
| ValueNullChar | Character used to fill truncated field values if ValueTruncateType = NullChar | * |
| HeadingQuotedType | Specification for how heading values are quoted | Main HeadingQuotedType |
| HeadingAlwaysWriteOptionalQuote | Specifier for whether heading optional quotes should be written | Main HeadingAlwaysWriteOptionalQuote |
| HeadingWritePrefixSpace | Specifier for whether headings should be prefixed with a space when written | Main HeadingWritePrefixSpace |
| HeadingPadAlignment | Alignment of padding for fixed width field headings | Main HeadingPadAlignment |
| HeadingPadCharType | Method used to pad fixed width field headings | Main HeadingPadCharType |
| HeadingPadChar | Character used to pad fixed width field headings | Main HeadingPadChar |
| HeadingTruncateType | Method used to truncate fixed width field headings | Main HeadingTruncateType |
| HeadingTruncateChar | Character used to fill truncated field headings if HeadingTruncateType = TruncateChar | Main HeadingTruncateChar |
| HeadingEndOfValueChar | Character used to flag End of Field Heading when HeadingPadCharType = EndOfValue | Main HeadingEndOfValueChar |
| Headings | Field Headings as comma text | <Blank> |
| Null | Specifies whether field value is Null if Constant = True | False |
| Value | Specifies field value if Constant = True | Depends on DataType |
| Format | Text format of field value | Depends on DataType |
| Styles | Either restrict or allow additional formatting when parsing text field values | Depends on DataType |
| FalseText | Text presentation of Boolean field False value | False |
| TrueText | Text presentation of Boolean field True value | True |

<Redirect> element properties/attributes:

| Attribute | Description | Default |
| --- | --- | --- |
| Index | Explicitly specifies position of Redirect | |
| Type | Specifies type of comparison Redirect makes with Field Value | Depends on DataType of Sequence Item's Field |
| SequenceName | Name of Sequence to be invoked if the Field's Value matches the Redirect Value | |
| InvokationDelay | Specifies whether specified Sequence should be invoked after current field or after current sequence | AfterField |
| Value | Value against which Field Value is compared | Depends on Redirect Type |

Substitution Section properties/attributes:

| Attribute | Description | Default |
| --- | --- | --- |
| Type | The type of substitution | String |
| Token | A character which determines the substitution to be invoked | |
| Value | The string value to replace the substitution character and token (if Type = String) | |
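A String substitution can be pictured with a short Python sketch: wherever the substitution character is followed by a known token, both are replaced by that token's value.  The function name apply_substitutions and the dictionary of tokens are hypothetical, and leaving unknown tokens untouched is an assumption; how the standard treats them is not covered here.

```python
def apply_substitutions(text, substitutions, substitution_char="\\"):
    """Replace each substitution_char + token pair with the token's value.

    substitutions maps a single-character Token to its replacement Value.
    Assumption: a substitution_char followed by an unknown token is kept as-is.
    """
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else None
        if ch == substitution_char and nxt in substitutions:
            out.append(substitutions[nxt])
            i += 2  # consume the substitution character and the token
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Hypothetical tokens: "n" stands for a line feed, "t" for a tab.
subs = {"n": "\n", "t": "\t"}
print(apply_substitutions(r"line one\nline two", subs))
```

The default substitution character used here, a backslash, matches the SubstitutionChar default in the Main Section.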


Sequence Section properties/attributes:

| Attribute | Description | Default |
| --- | --- | --- |
| Name | Name of Sequence | |
| Root | Specifies whether this is the first sequence invoked for each record (line) | False |
| FieldIndices | Shorthand list of fields in this sequence (Field indices array in commatext string) | <Blank> |

<Item> element properties/attributes:

| Attribute | Description | Default |
| --- | --- | --- |
| Index | Explicitly specifies position of Sequence Item in Sequence | |
| FieldIndex | Index of field (in Field List) used by this Sequence Item | |


Sequence Sections.  A Fielded Text file can have lines with different sets of fields depending on the value of one or more key fields.  The Sequence Sections in the Meta File specify the sequence of fields which can follow a key field.  A Sequence Section can contain the Name, Root and FieldIndices properties/attributes tabulated above.
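The interplay of Sequences, Items and Redirects can be modelled with a small Python sketch.  It is a conceptual illustration only: the class and function names are hypothetical, items refer to fields by name rather than by FieldIndex for readability, only equality comparison is modelled, and an invoked sequence is assumed to supply the fields that immediately follow the key field.

```python
from dataclasses import dataclass, field


@dataclass
class Redirect:
    value: str          # value the key field is compared against
    sequence_name: str  # sequence invoked when the value matches


@dataclass
class Item:
    field_name: str
    redirects: list = field(default_factory=list)


@dataclass
class Sequence:
    name: str
    items: list
    root: bool = False  # True for the sequence that starts every record


def parse_line(values, sequences):
    """Assign the values of one line to field names by following sequences."""
    by_name = {s.name: s for s in sequences}
    root = next(s for s in sequences if s.root)
    queue = list(root.items)  # items still expected on this line
    record = {}
    pos = 0
    while queue and pos < len(values):
        item = queue.pop(0)
        value = values[pos]
        pos += 1
        record[item.field_name] = value
        for redirect in item.redirects:
            if redirect.value == value:
                # Assumption: the invoked sequence's items come next.
                queue = list(by_name[redirect.sequence_name].items) + queue
                break
    return record


sequences = [
    Sequence("Root", [Item("Type", [Redirect("Dog", "DogFields"),
                                    Redirect("Fish", "FishFields")])], root=True),
    Sequence("DogFields", [Item("Name"), Item("NeedsWalking")]),
    Sequence("FishFields", [Item("Name"), Item("TankLitres")]),
]

print(parse_line(["Dog", "Rover", "True"], sequences))
print(parse_line(["Fish", "Charlie", "40"], sequences))
```

In this sketch the Type field acts as the key field: its value decides which set of fields the rest of the line is read into.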