Data Masking a Canadian Social Insurance Number
© 2009 Informatica Corporation
Abstract
A Social Insurance Number (SIN) is a number that the Canadian government issues to administer social programs. This article describes a Data Masking mapplet that you can configure to create a realistic SIN with a valid checksum.
Overview
The Canadian Social Insurance Number is an account number for the Canada Pension Plan, unemployment insurance, and other government programs. The SIN is similar to the United States Social Security number. The following example shows how to configure a Data Masking mapplet to mask a Canadian SIN. The example includes a Data Masking transformation that changes the SIN, an Expression transformation that formats the SIN, and another Expression transformation that calculates a valid checksum number.
Source Data
The source data is either a nine-character string without dashes or an eleven-character string in the following format: 123-456-789
The last character of the number is a checksum number.
Mapplet
The mapplet includes an Expression transformation that converts a 9-character SIN to an 11-character SIN. A Data Masking transformation masks the 11-character SIN. Another Expression transformation calculates a checksum number and replaces the last character of the SIN. The following figure shows the mapplet:
The mapplet has the following transformations:
Input. Input transformation that receives the Canadian SIN from the PowerCenter mapping. Passes the number to the Expression transformation.
Exp_SIN_Formatting. Expression transformation that converts a 9-character SIN to an 11-character SIN that contains dashes.
DM_Mask_SIN. Data Masking transformation that creates a key mask for the SIN.
Exp_Validate_SIN. Expression transformation that creates a checksum number.
Output. Output transformation that passes the masked SIN back to the PowerCenter mapping.
Input Transformation
The Input transformation receives the SIN. Connect the Input transformation to the Source Qualifier in the PowerCenter mapping.
Exp_SIN_Formatting Expression Transformation
The Exp_SIN_Formatting transformation receives the Canadian SIN. If the number is less than 9 characters, the transformation pads the SIN with zeros. If the SIN does not have dashes, the Expression transformation adds dashes to it. The Expression transformation contains the following ports:
SIN. Social Insurance number.
SIN_VAR. If SIN is less than 9 characters, the expression pads the number with zeros on the left side. Expression:
LPAD(SIN, 9, '0')
Output port. Adds dashes to a 9-character SIN. Returns an 11-character number in the format 123-456-789. Expression:
iif( length(SIN) = 9 , substr(SIN_VAR, 1,3) || '-' || substr(SIN_VAR, 4,3) || '-' || substr(SIN_VAR, 7,3), SIN)
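The formatting step can be sketched outside PowerCenter as a short Python function. This is an illustration only, not the mapplet itself: the function name format_sin is hypothetical, and the sketch follows the prose description (pad on the left, then insert dashes) rather than reproducing the expression literally.

```python
def format_sin(sin: str) -> str:
    """Return an 11-character SIN such as 123-456-789.

    Hypothetical helper that mirrors the described behavior of the
    formatting step; it is not part of PowerCenter.
    """
    if len(sin) == 11:
        # Assumed to be formatted with dashes already; pass through unchanged.
        return sin
    # Pad short values with zeros on the left, like LPAD(SIN, 9, '0').
    padded = sin.rjust(9, "0")
    # Insert dashes after the third and sixth digits.
    return f"{padded[0:3]}-{padded[3:6]}-{padded[6:9]}"
```

For example, format_sin("46454286") pads the value to nine digits and returns 046-454-286.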
DM_Mask_SIN Transformation
The DM_Mask_SIN Data Masking transformation applies a key mask to the SIN. The Data Masking transformation has an input port and an associated output port for the SIN. When you add a port to the Data Masking transformation, the Designer adds an output port by default. Each output port name is out_<port name>. The following figure shows the Masking Properties tab:
Key masking produces repeatable results for the same source SIN and seed value. The Data Masking transformation requires a seed value for the port when you configure it for key masking. For this example, the seed value is a default number. The following mask format limits each masked character in the output column to a numeric character and leaves the dashes unmasked: DDD+DDD+DDD
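The idea of a repeatable, format-preserving mask can be illustrated with a small Python sketch. It is not the algorithm that the Data Masking transformation uses, which this article does not describe. The sketch substitutes an HMAC-SHA256 digest as the source of replacement digits, and the function name key_mask and the seed value in the usage note are assumptions made for illustration. The sketch treats D as "replace with a digit" and + as "leave unchanged", as the mask format above implies.

```python
import hashlib
import hmac


def key_mask(sin: str, seed: int, mask_format: str = "DDD+DDD+DDD") -> str:
    """Deterministically mask the characters marked 'D' in mask_format.

    Stand-in technique (HMAC-SHA256), not Informatica's key masking.
    """
    if len(sin) != len(mask_format):
        raise ValueError("SIN length must match the mask format")
    # The same seed and source value always produce the same digest,
    # which makes the masked result repeatable.
    digest = hmac.new(str(seed).encode(), sin.encode(), hashlib.sha256).digest()
    masked = []
    for index, (char, rule) in enumerate(zip(sin, mask_format)):
        if rule == "D":
            # Replace the character with a digit derived from the digest.
            masked.append(str(digest[index] % 10))
        else:
            # '+' positions (the dashes) pass through unmasked.
            masked.append(char)
    return "".join(masked)
```

Calling key_mask("046-454-286", seed=190) twice returns the same 11-character string both times, with the dashes still in positions 4 and 8. The masked value generally does not have a valid checksum, which is why the mapplet recalculates the last digit in the next transformation.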
Exp_Validate_SIN Expression Transformation
The Exp_Validate_SIN Expression transformation receives the SIN from the Data Masking transformation. It calculates a checksum number for the SIN using the Luhn algorithm. The Luhn algorithm is a checksum formula that businesses use to validate a variety of identification numbers, such as credit card numbers. The following SIN has a valid checksum number according to the Luhn algorithm: 046 454 286
To test the checksum using the algorithm, multiply each digit in the SIN by the digit in the same position of the following number: 121 212 121
For example, the first digit of the SIN is zero. The first digit in the other number is one. Zero multiplied by one is zero. Zero is the first digit of the result. The second-to-last SIN digit, 8, multiplied by 2 is equal to 16. When the result is a two-digit number, add the digits together (1 + 6) and use the result, which is 7 in this example. The result is another 9 digit number: 086 858 276
Add the digits together: 0+8+6+8+5+8+2+7+6 = 50
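The multiplication and addition steps above can be sketched in Python as follows. The function names luhn_total and is_valid_sin are hypothetical and chosen for illustration. The sketch ignores spaces and dashes, and it subtracts 9 from a two-digit product, which gives the same result as adding its two digits.

```python
def luhn_total(sin: str) -> int:
    """Return the weighted digit sum for a 9-digit SIN."""
    digits = [int(ch) for ch in sin if ch.isdigit()]
    if len(digits) != 9:
        raise ValueError("a SIN must contain nine digits")
    total = 0
    for position, digit in enumerate(digits, start=1):
        # Weights follow the pattern 121 212 121.
        product = digit * (2 if position % 2 == 0 else 1)
        # A two-digit product (10 to 18) contributes the sum of its digits.
        total += product - 9 if product > 9 else product
    return total


def is_valid_sin(sin: str) -> bool:
    """A SIN passes the check when the total is divisible by 10."""
    return luhn_total(sin) % 10 == 0
```

For the example SIN, luhn_total("046 454 286") returns 50.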
The SIN is valid if the number is divisible by 10. The Expression transformation performs the same algorithm to calculate a checksum number. The following figure shows the ports in the Expression transformation:
The Expression transformation contains the following ports. Character positions count the dashes in the 11-character SIN, so the second, fourth, sixth, and eighth digits are at positions 2, 5, 7, and 10.
out_SIN_KEY. Receives the SIN from the Data Masking transformation.
Sec. Multiplies the second digit in the SIN by two. Expression:
TO_INTEGER(substr(out_SIN_KEY,2,1)) * 2
four. Multiplies the fourth digit in the SIN (character position 5) by two. Expression:
TO_INTEGER(substr(out_SIN_KEY,5,1)) * 2
six. Multiplies the sixth digit in the SIN (character position 7) by two. Expression:
TO_INTEGER(substr(out_SIN_KEY,7,1)) * 2
eight. Multiplies the eighth digit in the SIN (character position 10) by two. Example: 8 * 2 = 16. Expression:
TO_INTEGER(substr(out_SIN_KEY,10,1)) * 2
odd_values. Sums the digits in character positions 1, 3, 6, 9, and 11 of the SIN. Example: 0 + 6 + 5 + 2 + 4 = 17. Expression:
TO_INTEGER(substr(out_SIN_KEY,1,1)) + TO_INTEGER(substr(out_SIN_KEY,3,1)) + TO_INTEGER(substr(out_SIN_KEY,6,1)) + TO_INTEGER(substr(out_SIN_KEY,9,1)) + TO_INTEGER(substr(out_SIN_KEY,11,1))
Hld_sec. Adds the digits in Sec together if the length of Sec is two. Otherwise Hld_sec = Sec. Expression:
IIF(LENGTH(Sec) = 2, TO_INTEGER(substr(to_char(Sec),1,1)) + TO_INTEGER(substr(to_char(Sec),2,1)), Sec)
Hld_four. Adds the digits in four together if the length of four is two. Otherwise Hld_four = four. Expression:
IIF(LENGTH(four) = 2, TO_INTEGER(substr(to_char(four),1,1)) + TO_INTEGER(substr(to_char(four),2,1)), four)
Hld_six. Adds the digits in six together if the length of six is two. Otherwise Hld_six = six. Expression:
IIF(LENGTH(six) = 2, TO_INTEGER(substr(to_char(six),1,1)) + TO_INTEGER(substr(to_char(six),2,1)), six)
Hld_eight. Adds the digits in eight together if the length of eight is two. Otherwise Hld_eight = eight. Expression:
IIF(LENGTH(eight) = 2, TO_INTEGER(substr(to_char(eight),1,1)) + TO_INTEGER(substr(to_char(eight),2,1)), eight)
Holdvalue. Adds all digits together. Expression:
hld_sec + hld_four + hld_six + hld_eight + odd_values
high_number. Subtracts the last digit of Holdvalue from 10. Example: 10 - 1 = 9.
Check_digit. Adds high_number (9 in this example) to the last character of out_SIN_KEY. The last character of the result is the checksum number. Example: 4 + 9 = 13, so the checksum number is 3. Expression:
substr(to_char(high_number + TO_INTEGER(substr(out_SIN_KEY,-1))),-1)
The final expression returns the SIN if the last digit is equal to Check_digit. Otherwise, it appends Check_digit to the first 10 characters of the SIN:
iif(substr(out_SIN_KEY,11,1) = to_char(Check_digit), out_SIN_KEY, substr(out_SIN_KEY,1,10) || to_char(Check_digit))
The output port returns the SIN with a new check digit.
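The port logic can be traced end to end with the following Python sketch. It is an approximation for illustration, not the mapplet: the function name replace_check_digit is hypothetical, the variable names only echo the port names, and the high_number step is written from its description ("subtracts the last digit of Holdvalue from 10") because the expression itself is not shown. The sketch assumes an 11-character input with dashes in positions 4 and 8.

```python
def replace_check_digit(masked_sin: str) -> str:
    """Return masked_sin with its last digit replaced by a valid check digit."""

    def digit_at(position: int) -> int:
        # 1-based character position, like SUBSTR in the expressions.
        return int(masked_sin[position - 1])

    def sum_digits(value: int) -> int:
        # 16 becomes 1 + 6 = 7; single-digit values are unchanged.
        return value // 10 + value % 10

    # Sec, four, six, eight: even digits doubled (positions 2, 5, 7, 10).
    doubled = [digit_at(p) * 2 for p in (2, 5, 7, 10)]
    # odd_values: digits at positions 1, 3, 6, 9, 11.
    odd_values = sum(digit_at(p) for p in (1, 3, 6, 9, 11))
    # Holdvalue: reduced even products plus the odd digits.
    hold_value = sum(sum_digits(d) for d in doubled) + odd_values
    # high_number: 10 minus the last digit of Holdvalue.
    high_number = 10 - hold_value % 10
    # Check_digit: last digit of high_number plus the current last digit.
    check_digit = (high_number + digit_at(11)) % 10
    return masked_sin[:10] + str(check_digit)
```

For example, replace_check_digit("046-454-284") returns 046-454-286, and a value that already has a valid check digit is returned unchanged.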
Canadian SIN Target Table The Output transformation receives the masked Canadian SIN from the Expression transformation.
Author Ellen Chandler Principal Technical Writer
Acknowledgements Kevin Ware designed the mapplet.