Triple-S Schema and Data Files
Data Import expects the Triple-S schema definition file to be in Triple-S standard format. Note the following points when creating Triple-S schema definition files.
Installations that previously used Generic Refresh may need to change their existing schema and data files.
- If the Triple-S definition file does not specify the format of the data file, fixed format is assumed. Specify the required format in the <record> element as either “fixed” or “csv”. For example:
  <record ident="A" format="csv">
- If the Triple-S definition file does not specify the format of a single variable, the variable is assumed to be categoric numeric. Specify the required format in the <variable> element as either “literal” or “numeric”. For example:
  <variable ident="13" type="single" format="literal">
  Generic Refresh treated all singles as categoric text, so format="literal" may need to be added to single variables previously created using Generic Refresh.
- Quantity variables require a <values> tag, which must contain either a range of values or specific values. Generic Refresh allowed the values tag to be empty. (Note that if a range is used for a quantity, no validation is performed using the values; the range is only used to determine the size of the underlying database column.)
- When importing dates, the dates must be specified in the format YYYYMMDD within the data file.
- Data Import validates the Triple-S definition file before processing can continue. In particular, a <values> tag is invalid within a character variable definition, and a <size> tag is required only for a character variable definition; it is invalid for all other variable types. For complete documentation of the Triple-S standard, see the website http://www.triple-s.org. Validation tools may also be useful, e.g. http://triples-validator.appspot.com/
- If the schema definition file is not encoded as UTF-8, the encoding must be supplied in the schema definition. For example:
  <?xml version="1.0" encoding="ISO-8859-1"?>
- If the data file contains a value for a coded variable that has not been defined in the Triple-S definition, the whole record is rejected.
- When importing categoric text values, Data Import performs case-insensitive matching of the codes.
- When using multiple variables, the codes must start from 1 and be sequential, with no missing sequence numbers.
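Several of the schema rules listed above can be checked before a definition file is submitted to Data Import. The following Python sketch is illustrative only and is not part of Data Import: the function name lint_schema and its messages are hypothetical, and it covers only the record format, single format, quantity values and multiple code rules. It also assumes that a single variable without format="literal" must use codes made up only of digits.

```python
import xml.etree.ElementTree as ET


def lint_schema(xml_text):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    root = ET.fromstring(xml_text)
    for record in root.iter("record"):
        # No format attribute on <record> means fixed format.
        record_format = record.get("format", "fixed")
        if record_format not in ("fixed", "csv"):
            problems.append(f"record {record.get('ident')}: format must be fixed or csv")
        for var in record.findall("variable"):
            ident = var.get("ident")
            vtype = var.get("type")
            values = var.find("values")
            codes = [] if values is None else [v.get("code", "") for v in values.findall("value")]
            # Quantity variables need a <values> tag holding a range or specific values.
            if vtype == "quantity" and (values is None or len(values) == 0):
                problems.append(f"variable {ident}: quantity needs a non-empty <values>")
            # No format attribute on a single means categoric numeric (assumed: digit codes).
            if vtype == "single" and var.get("format", "numeric") == "numeric":
                if any(not code.isdigit() for code in codes):
                    problems.append(f'variable {ident}: text codes need format="literal"')
            # Multiple codes must be 1, 2, 3, ... with no gaps.
            if vtype == "multiple":
                expected = [str(n) for n in range(1, len(codes) + 1)]
                if codes != expected:
                    problems.append(f"variable {ident}: codes must run from 1 with no gaps")
    return problems
```

Calling lint_schema on the text of a definition file returns one message per rule that is broken, so a schema carried over from Generic Refresh can be reviewed in a single pass.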
Sample Triple-S Files
The sample Triple-S files below contain an example of each data type.
The example assumes the following (both of which are defined in the Triple-S configuration file):
- the PanellistID is the identifying field, and all other fields are profile data.
- the labelName attribute is set to the default of “name” and therefore the Q_Panel variables will be named using the <name> element.
An example of a Q_Panel extended date time variable is also shown. For this to work, the following namespace must be declared in the <sss> element of the Triple-S definition file:
<sss version="2.0" xmlns:marsc="http://www.marsc.com/">
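The namespace declaration matters to any namespace-aware XML parser, not only to Data Import: an attribute with the marsc: prefix cannot be parsed unless the prefix is bound. The following Python sketch uses the standard library ElementTree module to illustrate this on a trimmed document. The helper name extended_type is hypothetical, and how Data Import itself reads the attribute is not covered here.

```python
import xml.etree.ElementTree as ET

MARSC_NS = "http://www.marsc.com/"  # namespace URI declared in the <sss> element


def extended_type(schema_xml):
    """Return the marsc:extended-type of the first <variable>, or None if absent."""
    root = ET.fromstring(schema_xml)
    variable = root.find("variable")
    # ElementTree stores prefixed attributes as "{namespace-uri}local-name".
    return variable.get(f"{{{MARSC_NS}}}extended-type")


# Trimmed for illustration: the <variable> sits directly under <sss>.
WITH_NS = (
    '<sss version="2.0" xmlns:marsc="http://www.marsc.com/">'
    '<variable ident="10" type="date" marsc:extended-type="datetime"/></sss>'
)
WITHOUT_NS = WITH_NS.replace(' xmlns:marsc="http://www.marsc.com/"', "")

print(extended_type(WITH_NS))
try:
    extended_type(WITHOUT_NS)
except ET.ParseError as error:
    print("cannot be parsed without the declaration:", error)
```

With the declaration in place the attribute value is returned; without it the parser rejects the document before any variable is read.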
Sample CSV format Triple-S definition file
<?xml version="1.0"?>
<sss version="2.0" xmlns:marsc="http://www.marsc.com/">
<date>13-September-2012</date>
<time>12:00:00</time>
<origin>MARSC</origin>
<survey>
<name>Main Profiler</name>
<title>Main Profiler</title>
<record ident="A" format="csv">
<variable ident="1" type="character">
<name>PanellistID</name>
<label>PanellistID</label>
<position start="1" />
<size>255</size>
</variable>
<variable ident="2" type="single" >
<name>MStat</name>
<label>Marital Status</label>
<position start="2" />
<values>
<value code="1">(1) Married</value>
<value code="2">(2) Single</value>
<value code="3">(3) Divorced</value>
<value code="4">(4) Co-habiting</value>
</values>
</variable>
<variable ident="3" type="character">
<name>Postcode</name>
<label>Postcode</label>
<position start="3" />
<size>15</size>
</variable>
<variable ident="4" type="single" format="literal">
<name>SocialClass</name>
<label>Social Class</label>
<position start="4" />
<values>
<value code="A">A</value>
<value code="B">B</value>
<value code="C1">C1</value>
<value code="C2">C2</value>
<value code="D">D</value>
<value code="E">E</value>
</values>
</variable>
<variable ident="5" type="quantity">
<name>NoInHH</name>
<label>Number in Household</label>
<position start="5" />
<values>
<range from="1" to="10"/>
</values>
</variable>
<variable ident="6" type="multiple">
<name>SundayNews</name>
<label>Sunday Newspaper</label>
<position start="6" />
<values>
<value code="1">Independent on Sunday</value>
<value code="2">Mail on Sunday</value>
<value code="3">Observer</value>
<value code="4">Sunday Express</value>
<value code="5">Sunday Mirror</value>
<value code="6">Sunday Telegraph</value>
<value code="7">Sunday Times</value>
</values>
</variable>
<variable ident="7" type="logical">
<name>OwnMobilePhone</name>
<label>OwnMobilePhone</label>
<position start="7" />
</variable>
<variable ident="8" type="date">
<name>DOB</name>
<label>Date of Birth</label>
<position start="8" />
</variable>
<variable ident="9" type="time">
<name>TOB</name>
<label>Time of Birth</label>
<position start="9" />
</variable>
<variable ident="10" type="date" marsc:extended-type="datetime">
<name>Next Appointment</name>
<label>Next Appointment</label>
<position start="10" />
</variable>
</record>
</survey>
</sss>
Sample CSV format Triple-S data file
A CSV data file which matches the sample Triple-S definition file above could contain the following data:
---
10030,1,GU1,C1,4,0011010,Y,19730321,093023,20150720123030
---
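The relationship between the CSV record and the definition file can be sketched in Python. This is an illustration under stated assumptions, not Data Import's implementation: the function name read_csv_record is hypothetical, the schema is a trimmed copy of the sample with three of its ten variables, and position start is taken to be the 1-based field number, as the sample suggests. The sketch applies the YYYYMMDD date format, the case-insensitive code matching and the rejection of undefined codes described earlier.

```python
import csv
import io
import xml.etree.ElementTree as ET
from datetime import datetime

# Trimmed copy of the sample definition: three of the ten variables.
SCHEMA = """<sss version="2.0"><survey><record ident="A" format="csv">
<variable ident="1" type="character"><name>PanellistID</name><position start="1"/><size>255</size></variable>
<variable ident="2" type="single"><name>MStat</name><position start="2"/>
<values><value code="1">(1) Married</value><value code="2">(2) Single</value></values></variable>
<variable ident="8" type="date"><name>DOB</name><position start="8"/></variable>
</record></survey></sss>"""


def read_csv_record(schema_xml, line):
    """Map one CSV line to {variable name: value}; raise ValueError if it is invalid."""
    record = ET.fromstring(schema_xml).find("survey/record")
    fields = next(csv.reader(io.StringIO(line)))
    row = {}
    for var in record.findall("variable"):
        # Assumed: in csv format, position start is the 1-based field number.
        value = fields[int(var.find("position").get("start")) - 1]
        if var.get("type") == "date":
            # Dates must be YYYYMMDD; strptime raises ValueError otherwise.
            value = datetime.strptime(value, "%Y%m%d").date()
        elif var.get("type") == "single":
            # Codes are matched case-insensitively; an undefined code rejects the record.
            codes = {v.get("code").lower() for v in var.iter("value")}
            if value.lower() not in codes:
                raise ValueError(f"undefined code {value!r} for {var.findtext('name')}")
        row[var.findtext("name")] = value
    return row


row = read_csv_record(SCHEMA, "10030,1,GU1,C1,4,0011010,Y,19730321,093023,20150720123030")
print(row)
```

Raising an error for the whole line mirrors the rule that a record containing an undefined code is rejected as a whole rather than imported in part.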
Sample Fixed format Triple-S definition file
<?xml version="1.0"?>
<sss version="2.0" xmlns:marsc="http://www.marsc.com/">
<date>13-September-2012</date>
<time>12:00:00</time>
<origin>MARSC</origin>
<survey>
<name>Main Profiler</name>
<title>Main Profiler</title>
<record ident="A" format="fixed">
<variable ident="1" type="character">
<name>PanellistID</name>
<label>PanellistID</label>
<position start="1" finish="10"/>
<size>255</size>
</variable>
<variable ident="2" type="single" >
<name>MStat</name>
<label>Marital Status</label>
<position start="11" finish="11"/>
<values>
<value code="1">(1) Married</value>
<value code="2">(2) Single</value>
<value code="3">(3) Divorced</value>
<value code="4">(4) Co-habiting</value>
</values>
</variable>
<variable ident="3" type="character">
<name>Postcode</name>
<label>Postcode</label>
<position start="12" finish="26"/>
<size>15</size>
</variable>
<variable ident="4" type="single" format="literal">
<name>SocialClass</name>
<label>Social Class</label>
<position start="27" finish="28"/>
<values>
<value code="A">A</value>
<value code="B">B</value>
<value code="C1">C1</value>
<value code="C2">C2</value>
<value code="D">D</value>
<value code="E">E</value>
</values>
</variable>
<variable ident="5" type="quantity">
<name>NoInHH</name>
<label>Number in Household</label>
<position start="29" finish="30"/>
<values>
<range from="1" to="10"/>
</values>
</variable>
<variable ident="6" type="multiple">
<name>SundayNews</name>
<label>Sunday Newspaper</label>
<position start="31" finish="37"/>
<values>
<value code="1">Independent on Sunday</value>
<value code="2">Mail on Sunday</value>
<value code="3">Observer</value>
<value code="4">Sunday Express</value>
<value code="5">Sunday Mirror</value>
<value code="6">Sunday Telegraph</value>
<value code="7">Sunday Times</value>
</values>
</variable>
<variable ident="7" type="logical">
<name>OwnMobilePhone</name>
<label>OwnMobilePhone</label>
<position start="38" finish="38"/>
</variable>
<variable ident="8" type="date">
<name>DOB</name>
<label>Date of Birth</label>
<position start="39" finish="46"/>
</variable>
<variable ident="9" type="time">
<name>TOB</name>
<label>Time of Birth</label>
<position start="47" finish="52"/>
</variable>
<variable ident="10" type="date" marsc:extended-type="datetime">
<name>Next Appointment</name>
<label>Next Appointment</label>
<position start="53" finish="66"/>
</variable>
</record>
</survey>
</sss>
Sample Fixed format Triple-S data file
A fixed format data file which matches the sample Triple-S definition file above could contain the following data, with each value padded to the column positions given in the definition:
---
10030     1GU1            C1 40011010Y1973032109302320150720123030
---
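In a fixed format record every value must occupy exactly the columns given by its <position> element, so padding is significant. The following Python sketch rebuilds the sample record from its values and cuts it back into fields. It is illustrative only: the names POSITIONS, slice_fixed and selected_codes are hypothetical, the padding side of each value is an assumption, and the decoding of the multiple variable assumes that the nth character is "1" when code n is selected.

```python
# (start, finish) pairs copied from the <position> elements of the sample definition.
POSITIONS = {
    "PanellistID": (1, 10),
    "MStat": (11, 11),
    "Postcode": (12, 26),
    "SocialClass": (27, 28),
    "NoInHH": (29, 30),
    "SundayNews": (31, 37),
    "OwnMobilePhone": (38, 38),
    "DOB": (39, 46),
    "TOB": (47, 52),
    "Next Appointment": (53, 66),
}

# The sample record, padded to the column widths (padding side is an assumption).
LINE = (
    "10030".ljust(10) + "1" + "GU1".ljust(15) + "C1" + "4".rjust(2)
    + "0011010" + "Y" + "19730321" + "093023" + "20150720123030"
)


def slice_fixed(line, positions):
    """Cut a fixed format record into {name: value}, trimming the padding."""
    # start and finish are 1-based and inclusive, so the slice is [start - 1:finish].
    return {
        name: line[start - 1:finish].strip()
        for name, (start, finish) in positions.items()
    }


def selected_codes(bits):
    """Assumed decoding of a multiple: the nth character is "1" if code n is selected."""
    return [code for code, flag in enumerate(bits, start=1) if flag == "1"]


fields = slice_fixed(LINE, POSITIONS)
print(len(LINE), fields["Postcode"], selected_codes(fields["SundayNews"]))
```

The rebuilt record is 66 characters long, matching the finish position of the last variable, and under the stated assumption the SundayNews value 0011010 selects codes 3, 4 and 6.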