Guidelines for Supporting ISO Code Sets
Recommendation, 2007 April 15
Editors:
Chuck Allen, HR-XML Consortium, Inc.
Paul Kiel, HR-XML Consortium, Inc.
Kim Bartkus, HR-XML Consortium, Inc.
Authors:
Chuck Allen, HR-XML Consortium, Inc.
Contributors:
These guidelines draw heavily upon this technical note
prepared by Bill Kerr:
http://www.hr-xml.org/resources/TSC-ISO-Standards-v00-03.pdf
It also incorporates the “utility schemas” developed by Paul Kiel.
Copyright © 2007 HR-XML Consortium, Inc.
Abstract
A growing number of approved HR-XML specifications use a set of agreed-upon XML schema types to represent ISO codes for currency, country, gender, and names of languages. The purpose of this recommendation is to document the use of these types and to formalize their adoption as Cross-Process Objects.
Table of Contents
3.2 Uses/Implementation Considerations
4.2 Uses/Implementation Considerations
5.2 Uses/Implementation Considerations
6.1 Standards for Representation of Languages
6.3 Uses/Implementation Considerations
6.3.1 Use of “xml:lang” (language as metadata)
6.3.2 Use of XML Schema data type “language” (language as data)
8 Appendix A - Document Version History
9 Appendix B – Related Documents
The International Organization for Standardization (ISO) is a worldwide federation of national standards bodies. This document provides guidance on representing ISO codes for currency, country, gender, and names of languages within HR-XML Consortium specifications.
This document recommends that implementers of HR-XML Consortium specifications conform to ISO codes for currency, country, gender, and names of languages. The document also sets out XML schema types that HR-XML workgroups SHOULD incorporate within their specifications to support conformance with the ISO codes.
The XML schema types presented in this document are designed to support (but do not directly include) code sets that have been standardized by the International Organization for Standardization. In each case, the schema includes a pattern to ensure that the code transmitted conforms to the format – but not necessarily the content – prescribed under the ISO standard.
An alternative approach considered, but rejected, by the HR-XML Consortium’s Technical Steering Committee was to enumerate the actual codes prescribed under the ISO standards within HR-XML schemas. The approach of specifying a pattern to enforce the format of the particular ISO code set was favored over enumerating the codes themselves for the following reasons:
- Including the ISO code sets by reference, instead of including the codes directly, spares the HR-XML Consortium the burden of having to edit and update enumerations each time a new version of the particular ISO specification is published.
- Many ISO specifications are copyrighted and available only if a licensing fee is paid. By merely recommending the use of the ISO codes, but not including their content, HR-XML avoids any possible copyright or licensing controversies.
- The approach of enforcing the format of the codes, rather than the content, is relatively simple. In most cases, the format of the code will remain constant, while the actual code content will change over time. Enforcing the format of the codes adds little overhead to the development and maintenance of HR-XML specifications, yet it supports conformance with recognized international standards.
- Business logic for handling country and language code data and other codes often will be handled by a receiving system regardless of whether interchange data was validated against a complete list of ISO codes. For example, many companies, even those with wide-ranging global operations, may not have a compelling business reason to use the full 136 language codes within ISO 639-1:2002, Code for the Representation of Names of Languages. In such cases, parsing data against an enumerated list of the full 136 ISO-recognized language codes would add some overhead without a commensurate benefit for the implementer.
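The contrast between the two approaches can be sketched with a pair of schema types. This is an illustration only: the namespace (urn:example:iso-codes-sketch) and both type names are hypothetical, and the three enumerated values are arbitrary country codes, not a complete code set.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch only: the namespace and type names are hypothetical. -->
<xsd:schema targetNamespace="urn:example:iso-codes-sketch"
            xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns="urn:example:iso-codes-sketch"
            elementFormDefault="qualified">

  <!-- Rejected approach: enumerate the code content (three arbitrary codes shown).
       Each revision of the ISO list would require a new schema release. -->
  <xsd:simpleType name="EnumeratedCountryCodeExampleType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="CA"/>
      <xsd:enumeration value="JP"/>
      <xsd:enumeration value="US"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Adopted approach: constrain only the format (two uppercase letters). -->
  <xsd:simpleType name="PatternCountryCodeExampleType">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[A-Z][A-Z]"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```

With the pattern-based type, a newly assigned code validates without any schema change. Whether a given value is an actual ISO code is left to the receiving system.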
For purposes of indicating gender within HR-XML Consortium specifications, the codes prescribed by ISO 5218 (ISO 5218:1977 Information interchange -- Representation of human sexes) are RECOMMENDED.
ISO 5218 specifies the use of the following codes:
| Code | Definition |
|------|------------|
| 0 | Not Known |
| 1 | Male |
| 2 | Female |
| 9 | Not specified |
The following schema supports the use of ISO 5218. As explained in Section 2, Design Approach, an enumeration of the actual codes is not contained in the schema.
<xsd:schema targetNamespace="http://ns.hr-xml.org/2007-04-15" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://ns.hr-xml.org/2007-04-15" elementFormDefault="qualified" attributeFormDefault="unqualified">
  <xsd:element name="GenderCode" type="GenderCodeType"/>
  <xsd:simpleType name="GenderCodeType">
    <xsd:annotation>
      <xsd:documentation>Must conform to ISO 5218 - Representation of Human Sexes (0 - Not Known; 1 - Male; 2 - Female; 9 - Not specified)</xsd:documentation>
    </xsd:annotation>
    <xsd:restriction base="xsd:integer">
      <xsd:pattern value="[0129]"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
3.2 Uses/Implementation Considerations
The use of ISO 5218 ensures a uniform representation of gender in HR-XML specifications. The ISO convention for specifying “not known” or “not specified” helps eliminate ambiguity and allows the sender of data to deliberately exclude the specification of gender. For example, the “not specified” option may be useful when it is illegal or inappropriate to transmit information about a person’s gender.
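As an illustration, the fragments below show how instances of GenderCode would be expected to behave when validated against the schema above. Each element is a separate fragment rather than part of a single document, and the validation outcomes noted in the comments follow from the [0129] pattern.

```xml
<!-- Each element below is a separate instance fragment, not one document. -->

<!-- Female -->
<GenderCode xmlns="http://ns.hr-xml.org/2007-04-15">2</GenderCode>

<!-- The sender deliberately does not specify gender -->
<GenderCode xmlns="http://ns.hr-xml.org/2007-04-15">9</GenderCode>

<!-- Fails validation: "3" does not match the pattern [0129] -->
<GenderCode xmlns="http://ns.hr-xml.org/2007-04-15">3</GenderCode>
```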
For purposes of identifying countries within HR-XML Consortium specifications, the two-letter codes prescribed by ISO 3166-1 (ISO 3166-1, Codes for the representation of names of countries and their subdivisions) are RECOMMENDED.
The following schema enforces a two-letter, uppercase pattern to support the use of ISO 3166-1. As explained in Section 2, Design Approach, an enumeration of the actual codes is not contained in the schema.
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema targetNamespace="http://ns.hr-xml.org/2007-04-15" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://ns.hr-xml.org/2007-04-15" elementFormDefault="qualified" attributeFormDefault="unqualified">
  <xsd:element name="CountryCode" type="CountryCodeType"/>
  <xsd:simpleType name="CountryCodeType">
    <xsd:annotation>
      <xsd:documentation>Must conform to ISO 3166-1 Representation of Countries.</xsd:documentation>
    </xsd:annotation>
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[A-Z][A-Z]"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
4.2 Uses/Implementation Considerations
Examples of the use of CountryCodeType might include information related to the location of an organization or the nationality of a human resource.
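The fragments below sketch how the pattern behaves in practice. They are separate instance fragments, not one document. The last one illustrates the point made in the design approach: the schema checks the format of a code, not whether the code is actually assigned to a country.

```xml
<!-- Each element below is a separate instance fragment, not one document. -->

<!-- Japan: matches the two-letter, uppercase pattern -->
<CountryCode xmlns="http://ns.hr-xml.org/2007-04-15">JP</CountryCode>

<!-- Fails validation: lowercase letters do not match [A-Z][A-Z] -->
<CountryCode xmlns="http://ns.hr-xml.org/2007-04-15">jp</CountryCode>

<!-- Passes validation although ZZ is not assigned to any country;
     the receiving system must check the content of the code -->
<CountryCode xmlns="http://ns.hr-xml.org/2007-04-15">ZZ</CountryCode>
```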
The use of standard codes for representing currency is important to making HR-XML’s data exchange standards useful in many different countries and business scenarios.
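For the purpose of indicating currency within HR-XML specifications, three-letter ISO 4217 codes are RECOMMENDED (ISO 4217:2001 Codes for the representation of currencies and funds). Most ISO currency codes are based on the ISO country codes: they are made up of the two-character country code (ISO 3166-1) plus a one-character currency designator, as in USD or JPY. Codes that are not tied to a single country, such as the X-prefixed codes (for example, XAU for gold), do not follow this construction but still consist of three uppercase letters.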
Any HR-XML Consortium schema that represents a monetary amount SHOULD incorporate the CurrencyCodeType to support the capture of a conforming ISO 4217 code. Schema designs MAY allow for the transmission of a currency code to be optional, in which case it is assumed that the currency is understood or agreed upon by trading partners.
The following schema enforces a three-letter, uppercase pattern to support the use of ISO 4217 codes. As explained in Section 2, Design Approach, an enumeration of the actual codes is not contained in the schema.
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema targetNamespace="http://ns.hr-xml.org/2007-04-15" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://ns.hr-xml.org/2007-04-15" elementFormDefault="qualified" attributeFormDefault="unqualified">
  <xsd:element name="CurrencyCode" type="CurrencyCodeType"/>
  <xsd:simpleType name="CurrencyCodeType">
    <xsd:annotation>
      <xsd:documentation>Must conform to ISO 4217 - Representation of Currency and Funds</xsd:documentation>
    </xsd:annotation>
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[A-Z][A-Z][A-Z]"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
5.2 Uses/Implementation Considerations
Currency generally would be an attribute of an element requiring a monetary amount. However, other designs may be appropriate.
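One way to carry the currency as an attribute of a monetary amount is sketched below. This is not a design taken from any HR-XML specification: the namespace, the ExampleAmount element, the ExampleAmountType type, and the currencyCode attribute name are all hypothetical. The currency code type is repeated locally only so that the sketch is self-contained.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch only: namespace, element, type, and attribute names are hypothetical. -->
<xsd:schema targetNamespace="urn:example:iso-codes-sketch"
            xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns="urn:example:iso-codes-sketch"
            elementFormDefault="qualified"
            attributeFormDefault="unqualified">

  <!-- Same three-letter, uppercase pattern as CurrencyCodeType above,
       repeated here so the sketch stands alone. -->
  <xsd:simpleType name="CurrencyCodeType">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[A-Z][A-Z][A-Z]"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- A decimal amount whose currency is carried as an optional attribute.
       When the attribute is omitted, the currency is assumed to be agreed
       upon by the trading partners. -->
  <xsd:complexType name="ExampleAmountType">
    <xsd:simpleContent>
      <xsd:extension base="xsd:decimal">
        <xsd:attribute name="currencyCode" type="CurrencyCodeType" use="optional"/>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>

  <xsd:element name="ExampleAmount" type="ExampleAmountType"/>
</xsd:schema>
```

Under these assumptions, an instance such as an ExampleAmount element with currencyCode="USD" and content 52000.00 would validate, while currencyCode="usd" or currencyCode="US" would be rejected by the pattern.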
6.1 Standards for Representation of Languages
The W3C defines a special attribute, “xml:lang”, which specifies the language of the element or attribute content. This attribute is used in conjunction with the ISO 639-1 codes.
For exchanges that use HR-XML specifications, two-letter ISO 639-1 codes are preferred (ISO 639-1:2002 Code for the Representation of Names of Languages). This code set covers 136 languages and is in general use today.
ISO 639-2 is a three-character representation of languages. This code set covers 460 languages and addresses terminological as well as bibliographic needs. However, ISO 639-2 is not as widely implemented as ISO 639-1 and thus is not recommended as the basis for HR-XML schemas.
Another standard relating to the representation of languages is IETF (Internet Engineering Task Force) RFC 1766. This standard relates to language tags used in MIME/web applications. It builds on the codes defined in ISO 639-1.
The following schema is provided to enable consistent usage.
<xsd:schema targetNamespace="http://ns.hr-xml.org/2007-04-15" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://ns.hr-xml.org/2007-04-15" elementFormDefault="qualified" attributeFormDefault="unqualified">
  <xsd:import namespace="http://www.w3.org/XML/1998/namespace" schemaLocation="../../W3C/xml.xsd"/>
  <!-- language as data, capturing the language itself -->
  <xsd:element name="LanguageCode" type="LanguageCodeType"/>
  <xsd:simpleType name="LanguageCodeType">
    <xsd:annotation>
      <xsd:documentation>ISO 639-1 two character code is preferred, but not required.</xsd:documentation>
    </xsd:annotation>
    <xsd:restriction base="xsd:language"/>
  </xsd:simpleType>
  <!-- language as metadata, capturing a general descriptive string in a given language -->
  <xsd:element name="LanguageDependentText" type="LanguageDependentTextType"/>
  <xsd:complexType name="LanguageDependentTextType">
    <xsd:simpleContent>
      <xsd:extension base="xsd:string">
        <xsd:attribute ref="xml:lang">
          <xsd:annotation>
            <xsd:documentation>ISO 639-1 two character code is preferred, but not required.</xsd:documentation>
          </xsd:annotation>
        </xsd:attribute>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>
</xsd:schema>
Sample instances:
<!-- Japanese -->
<LanguageCode xmlns="http://ns.hr-xml.org/2007-04-15">ja</LanguageCode>
<!-- data is in French -->
<LanguageDependentText xmlns="http://ns.hr-xml.org/2007-04-15" xml:lang="fr">adieu</LanguageDependentText>
6.3 Uses/Implementation Considerations
6.3.1 Use of “xml:lang” (language as metadata)
The official RECOMMENDATION of the Consortium is to use the “xml:lang” attribute, as defined by the W3C, to indicate the language of data content. For example, the following indicates that the Address data is in English.
<Address xml:lang="en">123 Main Street</Address>
The LanguageDependentText and LanguageDependentTextType can be used as a standard element of text that is language dependent.
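As a sketch of language as metadata, the same text can be supplied in several languages by repeating LanguageDependentText with different “xml:lang” values. The wrapper element ExampleDescriptions and its namespace are hypothetical and exist only to make the fragment a well-formed document; the text values are illustrative.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- ExampleDescriptions and its namespace are hypothetical; only
     LanguageDependentText comes from the schema in this document. -->
<ex:ExampleDescriptions xmlns:ex="urn:example:iso-codes-sketch"
                        xmlns="http://ns.hr-xml.org/2007-04-15">
  <LanguageDependentText xml:lang="en">Software Engineer</LanguageDependentText>
  <LanguageDependentText xml:lang="fr">Ingénieur logiciel</LanguageDependentText>
  <LanguageDependentText xml:lang="de">Softwareentwickler</LanguageDependentText>
</ex:ExampleDescriptions>
```

A receiving system can then select the text whose “xml:lang” value matches the language it needs, without the language code being part of the data itself.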
6.3.2 Use of XML Schema data type “language” (language as data)
When the language itself is the data, for example to record which language a person speaks, the XML Schema “language” data type is used. For example, the following shows that a benefit plan participant speaks German.
<ParticipantPrimaryLanguage>de</ParticipantPrimaryLanguage>
The LanguageCode and LanguageCodeType can be used to capture language as a data element. In the interest of simplicity, the Consortium preference is to use the ISO 639-1 two-character code when possible; however, this is not required.
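Because LanguageCodeType restricts the XML Schema “language” data type without adding a pattern, values other than two-letter codes are also accepted. The fragments below sketch this. They are separate instance fragments, and the validation outcomes in the comments follow from the lexical rules of the “language” data type.

```xml
<!-- Each element below is a separate instance fragment, not one document. -->

<!-- ISO 639-1 two-letter code (preferred) -->
<LanguageCode xmlns="http://ns.hr-xml.org/2007-04-15">de</LanguageCode>

<!-- Language tag with a country subtag; also valid for xsd:language -->
<LanguageCode xmlns="http://ns.hr-xml.org/2007-04-15">de-CH</LanguageCode>

<!-- Fails validation: the underscore is not permitted by xsd:language -->
<LanguageCode xmlns="http://ns.hr-xml.org/2007-04-15">de_CH</LanguageCode>
```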
Human resources data, by its very nature, is personal data. The laws of many jurisdictions as well as codes of fair information practice require organizations to handle personal data in a way that protects individuals from loss of privacy.
The data exchange specifications developed by the HR-XML Consortium are designed to be useful across many jurisdictions and within a variety of business contexts. It is not feasible for the HR-XML Consortium to develop specific privacy guidance for every jurisdiction or business context in which the Consortium's specifications might be implemented. When implementing data exchanges using the HR-XML Consortium's data definitions (or, for that matter, any data exchange mechanism), organizations are advised to examine the privacy protections that may be required under applicable law or codes of fair information practice.
For information on protecting personal data, general references include: European Union Data Protection Directive (95/46/EC); the Association for Computing Machinery Code of Ethics (1992); Canadian Standards Association Model Code for the Protection of Personal Information (1995 -- PIPEDA); U.S.-EU Safe Harbor Principles and FAQs (2000).
8 Appendix A - Document Version History
| Date | Description |
|------|-------------|
| 2002-08-28 | Draft |
| 2002-11-18 | Added xml:lang information |
| 2002-12-16 | Added LanguageDependentText information |
| 2002-12-30 | Removed element LanguageDependentText. (Type is still included). Clarified ISO references. |
| 2003-02-26 | Approved recommendation by HR-XML Consortium. The default and targetNamespaces of all HR-XML schemas have been standardized. This recommendation is available as part of the HR-XML 2_0 architecture. |
| 2006-02-28 | Approved by Consortium |
| 2007-04-15 | Approved by Consortium |
9 Appendix B – Related Documents
The following documents are available for purchase from ISO (http://www.iso.org/):
ISO 5218:1977 Information interchange -- Representation of human sexes
ISO 639-1:2002 Codes for the representation of names of languages -- Part 1: Alpha-2 code
ISO 4217:2001 Codes for the representation of currencies and funds
ISO 3166-2:1998 Codes for the representation of names of countries and their subdivisions -- Part 2: Country subdivision code
ISO 3166-3:1999 Codes for the representation of names of countries and their subdivisions -- Part 3: Code for formerly used names of countries
Information on the details of xml:lang may be found at the W3C website:
http://www.w3.org/TR/2000/WD-xml-2e-20000814#sec-lang-tag