Masking CSV files

This walkthrough explains how to mask a CSV file by building a package that loads the CSV file, masks its data and writes the result to a new, masked CSV file.

Table of contents
Preconditions
Masking activities that are going to be used
Steps

Preconditions

  • BizDataX Package (Visual Studio project) is created
  • The CSV file that is going to be masked is saved in C:\Temp (this walkthrough uses Customers.csv)
  • The content of the CSV file must have the following format: the first row contains the column headers (e.g. "FirstName" and "LastName") and the following rows contain data. Values can be separated by a comma or by any other character; if another character is used, the handler expression must be edited accordingly (the file used in this walkthrough can be downloaded here)
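Before building the package, it can be useful to confirm that the header row of the file matches the expected column names. The following is a minimal sketch in plain C#, not part of BizDataX; the class name CsvHeaderCheck and the method HeaderMatches are hypothetical. It assumes that headers are not quoted and that the column order matters, which may be stricter than the handler actually requires.

```csharp
using System;
using System.Linq;

public static class CsvHeaderCheck
{
    // Hypothetical helper (not a BizDataX API): returns true when the header
    // line, split on the given delimiter, lists exactly the expected column
    // names in the same order. Quoted fields are not handled.
    public static bool HeaderMatches(string headerLine, char delimiter, string[] expectedColumns)
    {
        if (headerLine == null || expectedColumns == null)
        {
            return false;
        }

        // Split the header row and trim stray whitespace around each name.
        var actual = headerLine.Split(delimiter).Select(h => h.Trim()).ToArray();

        // Compare names one by one, case-sensitively.
        return actual.SequenceEqual(expectedColumns, StringComparer.Ordinal);
    }
}
```

For the file used in this walkthrough, the check could be called with the first line of C:\Temp\Customers.csv, the delimiter ',' and the expected columns "FirstName" and "LastName".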

Masking activities that are going to be used

  • Step
  • Sequence
  • Masking engine
  • Masking iterator
  • API handler
  • Pick US first name from list
  • Pick US last name from list

Steps

  1. From Solution Explorer open the Code.cs file.

  2. In Code.cs, add a class named Customer with properties whose names match the CSV headers.

    using System; 
    using System.Collections.Generic; 
    using System.Linq; 
    using System.Text;  
    
    namespace BizDataXPackage
    {
        /// <summary>
        /// Contains custom code to be used during package execution.
        /// </summary>
        public static class Code
        {
            // TODO: Add your code here
        }
    
        public class Customer
        {
            public String FirstName { get; set; }
            public String LastName { get; set; }
        }
    }
    
  3. Save your changes and build the project.

  4. Check the CSV file that is going to be masked. Its first few rows are:

    FirstName,LastName
    Tabatha,Beard
    Lukas,Warner
    Jessika,Luna
    Breanna,Hess
    Malik,Strong
    Hilary,Stark
    Shelbi,Bond
    Trinity,Rush
    
  5. From the Solution Explorer open the Package.xaml file.

  6. Delete the workflow created by default.

  7. Drag the Step activity from the Toolbox into the opened Package.xaml.

  8. Drag the Masking engine from the Toolbox into the Step.

  9. Drag the Masking iterator from the Toolbox into the Masking engine.

  10. In the Select Types pop-up window select Browse for Types... and find Customer.

  11. Click OK on both pop-up windows.

  12. Drag the API handler from the Toolbox into the Masking iterator, in the place that says "Drop data handler here".

  13. In the API handler, enter the following expression:

    api => api.Delimited().GetHandler<Customer>("C:/Temp/Customers.csv", "C:/Temp/Customers_masked.csv")
    
  14. Drag the Pick US first name from list masking activity from the Toolbox into the Masking iterator, where it says "Drop activities here".

  15. In the Pick US first name from list masking activity, select FirstName as the "Property".

  16. Repeat steps 14 and 15 for the Pick US last name from list masking activity, selecting LastName as the "Property".

  17. Save your changes and start package execution (Debug->Start Without Debugging).

  18. Open the C:\Temp directory to check the result. A new file named Customers_masked.csv has been created; it contains the masked data from the original Customers.csv file.

    FirstName,LastName
    Freddy,Monroe
    Allison,Ruiz
    Chad,Franklin
    Marisa,Riddle
    Nolan,Carney
    Jerrica,Flynn
    Alejandro,Bray
    Priscilla,Galloway
    

Figure 1: CSV file masking
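Conceptually, the package built in the steps above loads each row into a Customer object, replaces both names and writes the rows to a new file. The sketch below illustrates that flow in plain C#. It is not how BizDataX implements it: the class CsvMaskingSketch, its methods and the replacement name lists passed to Mask are all hypothetical, and the parsing is naive (no quoted fields, exactly two columns assumed).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Illustrative only: a hand-written stand-in for the load, mask and write flow.
public static class CsvMaskingSketch
{
    // Parses CSV lines (header row first) into Customer objects.
    public static List<Customer> Parse(IEnumerable<string> lines, char delimiter = ',')
    {
        return lines
            .Skip(1) // skip the header row
            .Where(line => !string.IsNullOrWhiteSpace(line))
            .Select(line => line.Split(delimiter))
            .Select(parts => new Customer { FirstName = parts[0], LastName = parts[1] })
            .ToList();
    }

    // Replaces both names with values picked at random from the given lists.
    public static void Mask(IEnumerable<Customer> customers, IList<string> firstNames,
                            IList<string> lastNames, Random random)
    {
        foreach (var customer in customers)
        {
            customer.FirstName = firstNames[random.Next(firstNames.Count)];
            customer.LastName = lastNames[random.Next(lastNames.Count)];
        }
    }

    // Serializes the customers back to CSV lines, starting with the header row.
    public static List<string> ToCsvLines(IEnumerable<Customer> customers, char delimiter = ',')
    {
        var result = new List<string> { "FirstName" + delimiter + "LastName" };
        result.AddRange(customers.Select(c => c.FirstName + delimiter + c.LastName));
        return result;
    }
}
```

With real files, the input lines could come from File.ReadAllLines and the output could be written with File.WriteAllLines. The masking activities in the package take the place of the Mask method here.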

Note: The API handler can be configured with additional options (custom delimiters, different encodings, etc.):

Figure 2: API handler options
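When it is unclear which delimiter a file uses, a quick check of the header row can help before a custom delimiter is configured. The sketch below is a hypothetical plain C# helper, not a BizDataX feature: it counts how often each candidate character occurs in the header line and returns the most frequent one. It assumes that the delimiter does not also appear inside the header names.

```csharp
using System;
using System.Linq;

public static class DelimiterGuess
{
    // Hypothetical helper: returns the candidate character that occurs most
    // often in the header line, or null when none of the candidates occurs.
    // On a tie, the candidate listed first wins.
    public static char? Guess(string headerLine, params char[] candidates)
    {
        if (string.IsNullOrEmpty(headerLine) || candidates == null || candidates.Length == 0)
        {
            return null;
        }

        var best = candidates
            .Select(c => new { Delimiter = c, Count = headerLine.Count(ch => ch == c) })
            .OrderByDescending(x => x.Count) // stable sort keeps candidate order on ties
            .First();

        return best.Count > 0 ? best.Delimiter : (char?)null;
    }
}
```

For example, calling Guess with the header line "FirstName;LastName" and the candidates ',', ';' and '\t' returns ';'.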