Triple-S Schema and Data Files
Data Import expects the Triple-S schema definition file to be in Triple-S standard format. The following is a list of points to note when creating Triple-S schema definition files.
For installations previously using Generic Refresh, some changes may be required to the schema and data files previously used.
- If the format of the data file is not specified in the Triple-S definition file, it is assumed to be fixed format. Specify the required format in the format attribute of the <record> element, which can be set to either “fixed” or “csv”. For example:
  <record ident="A" format="csv">
- If the format of a single variable is not specified in the Triple-S definition file, it is assumed to be categoric numeric. Specify the required format in the format attribute of the <variable> element, which can be set to either “literal” or “numeric”. For example:
  <variable ident="13" type="single" format="literal">
  Generic Refresh treated all singles as categoric text, so format="literal" may need to be added to single variables previously created using Generic Refresh.
- Quantity variables require a mandatory <values> tag, which must contain either a range of values or specific values. Generic Refresh allowed the values tag to be empty. (Note that if a range is used for a quantity, no validation is performed using the values; the range is only used to determine the size of the underlying database column.)
- When importing dates, the dates must be specified in the format YYYYMMDD within the data file.
- Data Import validates the Triple-S definition file before processing can continue. In particular, a <values> tag is invalid within a character variable definition, and a <size> tag is required only for a character variable definition; it is invalid for all other variable types. For complete documentation about the Triple-S standard see the website http://www.triple-s.org. Validation tools may also be useful, e.g. http://triples-validator.appspot.com/
- If the schema definition file is not encoded as UTF-8, the encoding needs to be supplied in the schema definition. For example:
  <?xml version="1.0" encoding="ISO-8859-1"?>
- Note that if the data file contains a value for a coded variable that has not been defined in the Triple-S definition, the whole record will be rejected.
- When importing categoric text values, Data Import performs case-insensitive matching of the codes.
- Note that when using multiple variables the codes must start from 1 and be sequential. There should be no missing sequence numbers.
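Some of the points above can be checked before a file is submitted. The following is a minimal Python sketch, not the validation that Data Import itself performs. It assumes the tag rules refer to the <values> and <size> elements, and the function name check_definition and the BAD_DEFINITION sample are hypothetical names introduced only for illustration.

```python
import xml.etree.ElementTree as ET


def check_definition(xml_text):
    """Return a list of rule violations found in a Triple-S definition.

    Illustrative only: this is not the validation that Data Import performs.
    """
    problems = []
    root = ET.fromstring(xml_text)
    for var in root.iter("variable"):
        ident = var.get("ident")
        vtype = var.get("type")
        values = var.find("values")
        size = var.find("size")

        if vtype == "character":
            if values is not None:
                problems.append(f"{ident}: <values> is invalid in a character variable")
            if size is None:
                problems.append(f"{ident}: character variable requires <size>")
        elif size is not None:
            problems.append(f"{ident}: <size> is only valid in a character variable")

        # Quantity variables need a non-empty <values> (a range or specific values).
        if vtype == "quantity" and (values is None or len(values) == 0):
            problems.append(f"{ident}: quantity variable requires non-empty <values>")

        # Codes of a multiple must be 1..n with no gaps.
        if vtype == "multiple" and values is not None:
            codes = [v.get("code", "") for v in values.findall("value")]
            numeric = all(c.isdecimal() for c in codes)
            if not numeric or sorted(map(int, codes)) != list(range(1, len(codes) + 1)):
                problems.append(f"{ident}: multiple codes must run 1..n without gaps")
    return problems


# Hypothetical definition that deliberately breaks several of the rules.
BAD_DEFINITION = """<sss version="2.0"><survey><record ident="A" format="csv">
  <variable ident="1" type="character"><name>ID</name><values/></variable>
  <variable ident="2" type="quantity"><name>N</name><values/></variable>
  <variable ident="3" type="multiple"><name>M</name>
    <values><value code="1">a</value><value code="3">b</value></values>
  </variable>
</record></survey></sss>"""

for problem in check_definition(BAD_DEFINITION):
    print(problem)
```

A check like this only catches the structural points listed here; the validation tools mentioned above remain the better test of conformance to the Triple-S standard.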
Sample Triple-S Files
The sample Triple-S files below contain an example of each data type.
The example assumes the following (both of which are defined in the Triple-S configuration file):
- the PanellistID is the identifying field, and all other fields are profile data.
- the labelName attribute is set to the default of “name” and therefore the Q_Panel variables will be named using the <name> element.
An example of a Q_Panel extended date time variable is also shown. For this to work, the marsc namespace must be declared on the <sss> element of the Triple-S definition file:
<sss version="2.0" xmlns:marsc="http://www.marsc.com/">
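To illustrate why the namespace declaration matters, here is a small Python sketch using the standard xml.etree.ElementTree module. It shows how a namespace-aware parser sees the marsc:extended-type attribute. The definition fragment and the extended_types helper are hypothetical and cut down for illustration; this is not a description of how Data Import reads the file.

```python
import xml.etree.ElementTree as ET

MARSC_NS = "http://www.marsc.com/"

# Cut-down, hypothetical definition containing one plain date and one extended date time.
definition = """<sss version="2.0" xmlns:marsc="http://www.marsc.com/">
  <survey><record ident="A" format="csv">
    <variable ident="8" type="date"><name>DOB</name></variable>
    <variable ident="10" type="date" marsc:extended-type="datetime">
      <name>NextAppointment</name>
    </variable>
  </record></survey>
</sss>"""


def extended_types(xml_text):
    """Map each variable ident to its marsc:extended-type value (None if absent)."""
    root = ET.fromstring(xml_text)
    # ElementTree addresses namespaced attributes as "{namespace-uri}local-name".
    key = f"{{{MARSC_NS}}}extended-type"
    return {var.get("ident"): var.get(key) for var in root.iter("variable")}


print(extended_types(definition))
```

If the xmlns:marsc declaration is left out, a namespace-aware parser such as this one refuses the file with an "unbound prefix" error, because the marsc: prefix is then undefined.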
Sample CSV format Triple-S definition file
<?xml version="1.0"?>
<sss version="2.0" xmlns:marsc="http://www.marsc.com/">
  <date>13-September-2012</date>
  <time>12:00:00</time>
  <origin>MARSC</origin>
  <survey>
    <name>Main Profiler</name>
    <title>Main Profiler</title>
    <record ident="A" format="csv">
      <variable ident="1" type="character">
        <name>PanellistID</name>
        <label>PanellistID</label>
        <position start="1" />
        <size>255</size>
      </variable>
      <variable ident="2" type="single">
        <name>MStat</name>
        <label>Marital Status</label>
        <position start="2" />
        <values>
          <value code="1">(1) Married</value>
          <value code="2">(2) Single</value>
          <value code="3">(3) Divorced</value>
          <value code="4">(4) Co-habiting</value>
        </values>
      </variable>
      <variable ident="3" type="character">
        <name>Postcode</name>
        <label>Postcode</label>
        <position start="3" />
        <size>15</size>
      </variable>
      <variable ident="4" type="single" format="literal">
        <name>SocialClass</name>
        <label>Social Class</label>
        <position start="4" />
        <values>
          <value code="A">A</value>
          <value code="B">B</value>
          <value code="C1">C1</value>
          <value code="C2">C2</value>
          <value code="D">D</value>
          <value code="E">E</value>
        </values>
      </variable>
      <variable ident="5" type="quantity">
        <name>NoInHH</name>
        <label>Number in Household</label>
        <position start="5" />
        <values>
          <range from="1" to="10"/>
        </values>
      </variable>
      <variable ident="6" type="multiple">
        <name>SundayNews</name>
        <label>Sunday Newspaper</label>
        <position start="6" />
        <values>
          <value code="1">Independent on Sunday</value>
          <value code="2">Mail on Sunday</value>
          <value code="3">Observer</value>
          <value code="4">Sunday Express</value>
          <value code="5">Sunday Mirror</value>
          <value code="6">Sunday Telegraph</value>
          <value code="7">Sunday Times</value>
        </values>
      </variable>
      <variable ident="7" type="logical">
        <name>OwnMobilePhone</name>
        <label>OwnMobilePhone</label>
        <position start="7" />
      </variable>
      <variable ident="8" type="date">
        <name>DOB</name>
        <label>Date of Birth</label>
        <position start="8" />
      </variable>
      <variable ident="9" type="time">
        <name>TOB</name>
        <label>Time of Birth</label>
        <position start="9" />
      </variable>
      <variable ident="10" type="date" marsc:extended-type="datetime">
        <name>Next Appointment</name>
        <label>Next Appointment</label>
        <position start="10" />
      </variable>
    </record>
  </survey>
</sss>
Sample CSV format Triple-S data file
A csv data file which matches the sample Triple-S definition file above could contain the following data:
--- 10030,1,GU1,C1,4,0011010,Y,19730321,093023,20150720123030 ---
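As a rough illustration of how the sample row maps onto the ten variables, the following Python sketch parses it field by field. It is not Data Import's own parser. Several details are assumptions inferred from the sample values rather than stated in this document: that the multiple is a string of 0/1 flags with one position per code, that the time is HHMMSS, and that the extended date time is YYYYMMDDHHMMSS. The helper names match_code, decode_multiple and parse_row are hypothetical.

```python
import csv
import io
from datetime import datetime

# Codes taken from the SocialClass variable in the sample definition.
SOCIAL_CLASS_CODES = ["A", "B", "C1", "C2", "D", "E"]


def match_code(value, codes):
    """Case-insensitive lookup; an undefined code rejects the whole record."""
    lookup = {code.lower(): code for code in codes}
    try:
        return lookup[value.strip().lower()]
    except KeyError:
        raise ValueError(f"undefined code {value!r}: record rejected") from None


def decode_multiple(bits):
    """Turn a flag string such as '0011010' into the list of selected codes.

    Assumption: one 0/1 character per code, in code order starting at 1.
    """
    return [code for code, bit in enumerate(bits, start=1) if bit == "1"]


def parse_row(line):
    """Split one CSV record and convert each field according to its variable type."""
    fields = next(csv.reader(io.StringIO(line)))
    return {
        "PanellistID": fields[0],
        "MStat": int(fields[1]),
        "Postcode": fields[2],
        "SocialClass": match_code(fields[3], SOCIAL_CLASS_CODES),
        "NoInHH": int(fields[4]),
        "SundayNews": decode_multiple(fields[5]),
        "OwnMobilePhone": fields[6],
        "DOB": datetime.strptime(fields[7], "%Y%m%d").date(),
        "TOB": datetime.strptime(fields[8], "%H%M%S").time(),  # assumed HHMMSS
        "NextAppointment": datetime.strptime(fields[9], "%Y%m%d%H%M%S"),  # assumed
    }


# Same row as the sample, with the social class deliberately written in lower case.
record = parse_row("10030,1,GU1,c1,4,0011010,Y,19730321,093023,20150720123030")
print(record["SocialClass"], record["SundayNews"], record["DOB"])
```

Under these assumptions the row selects codes 3, 4 and 6 of the Sunday Newspaper variable (Observer, Sunday Express and Sunday Telegraph), and the lower-case "c1" still matches the defined code C1.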
Sample Fixed format Triple-S definition file
<?xml version="1.0"?>
<sss version="2.0" xmlns:marsc="http://www.marsc.com/">
  <date>13-September-2012</date>
  <time>12:00:00</time>
  <origin>MARSC</origin>
  <survey>
    <name>Main Profiler</name>
    <title>Main Profiler</title>
    <record ident="A" format="fixed">
      <variable ident="1" type="character">
        <name>PanellistID</name>
        <label>PanellistID</label>
        <position start="1" finish="10"/>
        <size>255</size>
      </variable>
      <variable ident="2" type="single">
        <name>MStat</name>
        <label>Marital Status</label>
        <position start="11" finish="11"/>
        <values>
          <value code="1">(1) Married</value>
          <value code="2">(2) Single</value>
          <value code="3">(3) Divorced</value>
          <value code="4">(4) Co-habiting</value>
        </values>
      </variable>
      <variable ident="3" type="character">
        <name>Postcode</name>
        <label>Postcode</label>
        <position start="12" finish="26"/>
        <size>15</size>
      </variable>
      <variable ident="4" type="single" format="literal">
        <name>SocialClass</name>
        <label>Social Class</label>
        <position start="27" finish="28"/>
        <values>
          <value code="A">A</value>
          <value code="B">B</value>
          <value code="C1">C1</value>
          <value code="C2">C2</value>
          <value code="D">D</value>
          <value code="E">E</value>
        </values>
      </variable>
      <variable ident="5" type="quantity">
        <name>NoInHH</name>
        <label>Number in Household</label>
        <position start="29" finish="30"/>
        <values>
          <range from="1" to="10"/>
        </values>
      </variable>
      <variable ident="6" type="multiple">
        <name>SundayNews</name>
        <label>Sunday Newspaper</label>
        <position start="31" finish="37"/>
        <values>
          <value code="1">Independent on Sunday</value>
          <value code="2">Mail on Sunday</value>
          <value code="3">Observer</value>
          <value code="4">Sunday Express</value>
          <value code="5">Sunday Mirror</value>
          <value code="6">Sunday Telegraph</value>
          <value code="7">Sunday Times</value>
        </values>
      </variable>
      <variable ident="7" type="logical">
        <name>OwnMobilePhone</name>
        <label>OwnMobilePhone</label>
        <position start="38" finish="38"/>
      </variable>
      <variable ident="8" type="date">
        <name>DOB</name>
        <label>Date of Birth</label>
        <position start="39" finish="46"/>
      </variable>
      <variable ident="9" type="time">
        <name>TOB</name>
        <label>Time of Birth</label>
        <position start="47" finish="52"/>
      </variable>
      <variable ident="10" type="date" marsc:extended-type="datetime">
        <name>Next Appointment</name>
        <label>Next Appointment</label>
        <position start="53" finish="66"/>
      </variable>
    </record>
  </survey>
</sss>
Sample Fixed format Triple-S data file
A fixed format data file which matches the sample Triple-S definition file above could contain the following data:
--- 10030     1GU1            C1 40011010Y1973032109302320150720123030 ---
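The start and finish positions in the fixed format definition can be read as 1-based, inclusive column numbers. The Python sketch below cuts a 66-character record into its fields on that basis. The LAYOUT table is copied from the <position> elements of the sample definition; the padding used to build the record (text fields padded on the right, the quantity padded on the left) is an assumption, and slice_record is a hypothetical helper, not part of Data Import.

```python
# (name, start, finish) copied from the <position> elements of the sample definition.
LAYOUT = [
    ("PanellistID", 1, 10),
    ("MStat", 11, 11),
    ("Postcode", 12, 26),
    ("SocialClass", 27, 28),
    ("NoInHH", 29, 30),
    ("SundayNews", 31, 37),
    ("OwnMobilePhone", 38, 38),
    ("DOB", 39, 46),
    ("TOB", 47, 52),
    ("NextAppointment", 53, 66),
]


def slice_record(line, layout=LAYOUT):
    """Cut a fixed-format record into fields using 1-based inclusive positions."""
    return {name: line[start - 1:finish].strip() for name, start, finish in layout}


# Build the sample record with explicit padding so the column widths are visible.
# Assumption: text is left-justified and the quantity is right-justified.
line = (
    "10030".ljust(10)    # columns 1-10
    + "1"                # column 11
    + "GU1".ljust(15)    # columns 12-26
    + "C1"               # columns 27-28
    + "4".rjust(2)       # columns 29-30
    + "0011010"          # columns 31-37
    + "Y"                # column 38
    + "19730321"         # columns 39-46
    + "093023"           # columns 47-52
    + "20150720123030"   # columns 53-66
)

fields = slice_record(line)
print(len(line), fields["Postcode"], fields["NoInHH"], fields["NextAppointment"])
```

Building the line from padded pieces makes it easy to confirm that the total record length matches the highest finish position in the definition, which is 66 in this sample.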