# 7.4. Address ingestion via the ETL process

[![address ingestion process.png](https://doc.rncv.lu/uploads/images/gallery/2025-10/scaled-1680-/address-ingestion-process.png)](https://doc.rncv.lu/uploads/images/gallery/2025-10/address-ingestion-process.png)

<table border="1" id="bkmrk-the-background-color" style="border-collapse: collapse; width: 100%; border-width: 1px; background-color: rgb(236, 240, 241); height: 116.4375px;"><tbody><tr style="height: 116.4375px;"><td style="height: 116.4375px;">The background colors of the above image are to be interpreted as:

- <span style="color: rgb(132, 63, 161);">**purple**</span>: the core application a.k.a. backend that will be built in the context of this project
- **<span style="color: rgb(22, 145, 121);">green</span>**: supporting systems that will be used in the context of this project
- **<span style="color: rgb(35, 111, 161);">blue</span>**: trusted parties
- **<span style="color: rgb(186, 55, 42);">red</span>**: external users of the system

</td></tr></tbody></table>

The ETL process, which ingests data from one or more trusted external Address Data Providers, is a key element of the RNCV. The ETL is responsible for:

- Ingesting new addresses present on the Address Data Providers
- Correcting existing addresses
- Validating addresses ingested by the Data producers

You will find below a high-level representation of the ingestion process. The exact algorithms used by the ETL process are out of scope of this project and are therefore abstracted as a black box in the process description.

<p class="callout warning">The ETL algorithm is out of scope of this project</p>

## ETL Process

[![etl process.png](https://doc.rncv.lu/uploads/images/gallery/2025-10/scaled-1680-/etl-process.png)](https://doc.rncv.lu/uploads/images/gallery/2025-10/etl-process.png)

<table class="simple-table" id="bkmrk-name-address-ingesti"><thead class="simple-table-header"><tr id="bkmrk-name-address-ingesti-1"><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-name">Name</th><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-address-ingestion-vi">Address ingestion via the ETL process</th></tr></thead><tbody><tr id="bkmrk-purpose-ingest-and-v"><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-purpose">**Purpose**</th><td class="align-left" id="bkmrk-ingest-and-validate-">Ingest and validate address data to maintain the quality of the RNCV’s address database to the highest standards</td></tr><tr id="bkmrk-linked-user-stories-"><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-linked-user-stories">**Linked user stories**</th><td class="align-left" id="bkmrk-4.67.-etl---retrieve">[4.67. ETL - Retrieve addresses](https://doc.rncv.lu/books/architecture-documentation/page/467-etl-retrieve-addresses "4.67. ETL - Retrieve addresses")

[4.68. ETL - Create and update addresses](https://doc.rncv.lu/books/architecture-documentation/page/468-etl-create-and-update-addresses "4.68. ETL - Create and update addresses")

</td></tr><tr id="bkmrk-apis-used-get-%2Fetl%2Fa"><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-apis-used">**APIs used**</th><td class="align-left" id="bkmrk-get-%2Fetl%2Faddressespo">**GET** /etl/addresses  
**POST** /etl/addresses  
**GET** /etl/addresses/&lt;address-id&gt;  
**PUT** /etl/addresses/&lt;address-id&gt;  
**PATCH** /etl/addresses/&lt;address-id&gt;</td></tr><tr id="bkmrk-scope-this-process-h"><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-scope">**Scope**</th><td class="align-left" id="bkmrk-this-process-handles">This process handles the ingestion of addresses into the RNCV’s address database. It also handles the correction and validation of existing address data.  
  
The exact algorithm used by the ETL process is out of scope of this project.</td></tr><tr id="bkmrk-roles-etl%2C-system"><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-roles">**Roles**</th><td class="align-left" id="bkmrk-etl%2C-system">ETL, System</td></tr><tr id="bkmrk-input---addresses-fr"><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-input">**Input**</th><td class="align-left" id="bkmrk---addresses-from-the">- Addresses from the Address Data Providers  
- Algorithm for the address data consolidation</td></tr><tr id="bkmrk-output---consolidate"><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-output">**Output**</th><td class="align-left" id="bkmrk---consolidated-and-u">- Consolidated and up-to-date RNCV address database</td></tr></tbody></table>
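As an illustration of the "APIs used" row above, the following minimal sketch maps each listed endpoint to a method and path. The operation labels (`list`, `create`, `read`, `replace`, `patch`) and the `build_request` helper are hypothetical names introduced here for readability; only the HTTP methods and paths come from the table, and the request and response payloads are not covered.

```python
from urllib.parse import quote

# Methods and paths are taken from the "APIs used" row.
# The dictionary keys are hypothetical labels, not part of the API.
ETL_ENDPOINTS = {
    "list": ("GET", "/etl/addresses"),
    "create": ("POST", "/etl/addresses"),
    "read": ("GET", "/etl/addresses/{address_id}"),
    "replace": ("PUT", "/etl/addresses/{address_id}"),
    "patch": ("PATCH", "/etl/addresses/{address_id}"),
}


def build_request(operation, address_id=None):
    """Return the (method, path) pair for a given operation label."""
    method, template = ETL_ENDPOINTS[operation]
    if "{address_id}" in template:
        if address_id is None:
            raise ValueError(f"operation '{operation}' requires an address id")
        # Percent-encode the id so it is safe to use as a single path segment.
        return method, template.format(address_id=quote(str(address_id), safe=""))
    return method, template
```

For example, `build_request("patch", "42")` yields the method `PATCH` and the path `/etl/addresses/42`.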

## Detailed Process description

### Main process

<table class="simple-table" id="bkmrk-step-description-act" style="width: 100%;"><thead class="simple-table-header"><tr id="bkmrk-step-description-act-1"><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-step" style="width: 5.721097%;">Step</th><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-description" style="width: 26.460072%;">Description</th><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-actor%28s%29" style="width: 8.581645%;">Actor(s)</th><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-input%28s%29" style="width: 16.805721%;">Input(s)</th><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-output%28s%29" style="width: 16.448294%;">Output(s)</th><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-decision-points" style="width: 25.863982%;">Decision points</th></tr></thead><tbody><tr id="bkmrk-1-the-etl-process-is"><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-1" style="width: 5.721097%;">**1**</th><td class="align-left" id="bkmrk-the-etl-process-is-p" style="width: 26.460072%;">The ETL process is periodically triggered</td><td class="align-left" id="bkmrk-etl" style="width: 8.581645%;">ETL</td><td class="align-left" id="bkmrk--" style="width: 16.805721%;">-</td><td class="align-left" id="bkmrk---1" style="width: 16.448294%;">-</td><td class="align-left" id="bkmrk--2" style="width: 25.863982%;"></td></tr><tr id="bkmrk-2-the-etl-process-re"><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-2" style="width: 5.721097%;">**2**</th><td class="align-left" id="bkmrk-the-etl-process-retr" style="width: 26.460072%;">The ETL process retrieves the data to synchronise from the Address Data Providers</td><td class="align-left" id="bkmrk-etl-1" style="width: 8.581645%;">ETL</td><td class="align-left" id="bkmrk---2" style="width: 
16.805721%;">-</td><td class="align-left" id="bkmrk---addresses-to-synch" style="width: 16.448294%;">- addresses to synchronise</td><td class="align-left" id="bkmrk--3" style="width: 25.863982%;"></td></tr><tr id="bkmrk-3-the-etl-process-pr"><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-3" style="width: 5.721097%;">**3**</th><td class="align-left" id="bkmrk-the-etl-process-proc" style="width: 26.460072%;">The ETL process processes one address from the addresses to synchronise (this may be done in parallel)</td><td class="align-left" id="bkmrk-etl-2" style="width: 8.581645%;">ETL</td><td class="align-left" id="bkmrk---addresses-to-synch-1" style="width: 16.805721%;">- addresses to synchronise</td><td class="align-left" id="bkmrk---next-address-to-sy" style="width: 16.448294%;">- next address to synchronise</td><td class="align-left" id="bkmrk--4" style="width: 25.863982%;"></td></tr><tr id="bkmrk-4-the-etl-process-ex"><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-4" style="width: 5.721097%;">**4**</th><td class="align-left" id="bkmrk-the-etl-process-extr" style="width: 26.460072%;">The ETL process extracts the address information to be stored in the RNCV address database</td><td class="align-left" id="bkmrk-etl-3" style="width: 8.581645%;">ETL</td><td class="align-left" id="bkmrk---address-to-synchro" style="width: 16.805721%;">- address to synchronise</td><td class="align-left" id="bkmrk---address-informatio" style="width: 16.448294%;">- address information to be ingested</td><td class="align-left" id="bkmrk--5" style="width: 25.863982%;"></td></tr><tr id="bkmrk-5-the-etl-process-se"><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-5" style="width: 5.721097%;">**5**</th><td class="align-left" id="bkmrk-the-etl-process-sear" style="width: 26.460072%;">The ETL process searches for address matches in the RNCV address database</td><td class="align-left" id="bkmrk-etl-4" 
style="width: 8.581645%;">ETL</td><td class="align-left" id="bkmrk---address-informatio-1" style="width: 16.805721%;">- address information to be ingested</td><td class="align-left" id="bkmrk---address-present-in" style="width: 16.448294%;">- address present in the RNCV address database if any</td><td class="align-left" id="bkmrk--6" style="width: 25.863982%;"></td></tr><tr id="bkmrk-6-the-etl-process-ch"><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-6" style="width: 5.721097%;">**6**</th><td class="align-left" id="bkmrk-the-etl-process-chec" style="width: 26.460072%;">The ETL process checks if the address is present in the RNCV address database</td><td class="align-left" id="bkmrk-etl-5" style="width: 8.581645%;">ETL</td><td class="align-left" id="bkmrk---address-present-in-1" style="width: 16.805721%;">- address present in the RNCV address database if any</td><td class="align-left" id="bkmrk---yes-%2F-no" style="width: 16.448294%;">- yes / no</td><td class="align-left" id="bkmrk-if-the-address-is-pr" style="width: 25.863982%;">**If the address is present:**  
Go to step 7  
**Else:**  
Go to secondary process S.1.</td></tr><tr id="bkmrk-7-correct-address-in"><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-7" style="width: 5.721097%;">**7**</th><td class="align-left" id="bkmrk-correct-address-info" style="width: 26.460072%;">Correct address information and set the flag “validated = true”</td><td class="align-left" id="bkmrk-etl-6" style="width: 8.581645%;">ETL</td><td class="align-left" id="bkmrk---nrvc-address" style="width: 16.805721%;">- RNCV address</td><td class="align-left" id="bkmrk---corrected-nrvc-add" style="width: 16.448294%;">- Corrected RNCV address with flag “validated = true”</td><td class="align-left" id="bkmrk--7" style="width: 25.863982%;"></td></tr><tr id="bkmrk-8-system-applies-the"><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-8" style="width: 5.721097%;">**8**</th><td class="align-left" id="bkmrk-system-applies-the-a" style="width: 26.460072%;">System applies the address update</td><td class="align-left" id="bkmrk-system" style="width: 8.581645%;">System</td><td class="align-left" id="bkmrk---corrected-nrvc-add-1" style="width: 16.805721%;">- Corrected RNCV address with flag “validated = true”</td><td class="align-left" id="bkmrk---corrected-nrvc-add-2" style="width: 16.448294%;">- Corrected RNCV address with flag “validated = true”</td><td class="align-left" id="bkmrk--8" style="width: 25.863982%;"></td></tr><tr id="bkmrk-9-the-etl-process-ch"><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-9" style="width: 5.721097%;">**9**</th><td class="align-left" id="bkmrk-the-etl-process-chec-1" style="width: 26.460072%;">The ETL process checks if more addresses need to be synchronised</td><td class="align-left" id="bkmrk-etl-7" style="width: 8.581645%;">ETL</td><td class="align-left" id="bkmrk---addresses-that-sti" style="width: 16.805721%;">- addresses that still need to be synchronised</td><td class="align-left" id="bkmrk---yes-%2F-no-1" 
style="width: 16.448294%;">- yes / no</td><td class="align-left" id="bkmrk-if-there-are-still-a" style="width: 25.863982%;">**If there are still addresses to be synchronised:** Go to step 3  
**Else:** Go to step 10</td></tr><tr id="bkmrk-10-the-etl-process-t"><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-10" style="width: 5.721097%;">**10**</th><td class="align-left" id="bkmrk-the-etl-process-term" style="width: 26.460072%;">The ETL process terminates successfully</td><td class="align-left" id="bkmrk-etl-8" style="width: 8.581645%;">ETL</td><td class="align-left" id="bkmrk---3" style="width: 16.805721%;">-</td><td class="align-left" id="bkmrk---4" style="width: 16.448294%;">-</td><td class="align-left">  
</td></tr></tbody></table>
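The control flow of the main process can be sketched as follows. This is a minimal illustration under stated assumptions, not the ETL implementation: `InMemoryRncv`, its methods, the `key` field used for matching, and the `synchronise` function are all hypothetical. The real matching and correction logic belongs to the out-of-scope ETL algorithm, and the real system would go through the `/etl/addresses` endpoints rather than an in-memory dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryRncv:
    """Hypothetical stand-in for the RNCV address database."""
    addresses: dict = field(default_factory=dict)

    def find_match(self, info):
        # Step 5: real matching is part of the out-of-scope algorithm;
        # this sketch assumes an exact lookup on a hypothetical "key" field.
        return info["key"] if info["key"] in self.addresses else None

    def update(self, address_id, info):
        # Steps 7 and 8: store the corrected address with validated = true.
        self.addresses[address_id] = {**info, "validated": True}

    def create(self, info):
        # Secondary process S.1: create the address with validated = true.
        self.addresses[info["key"]] = {**info, "validated": True}


def synchronise(provider_addresses, rncv, extract=lambda raw: raw):
    """Run steps 3 to 9 over the addresses retrieved in step 2."""
    created, corrected = 0, 0
    for raw in provider_addresses:      # steps 3 and 9: iterate until none remain
        info = extract(raw)             # step 4: extract the address information
        match = rncv.find_match(info)   # step 5: search for a match
        if match is not None:           # step 6: is the address present?
            rncv.update(match, info)    # steps 7 and 8: correct and apply
            corrected += 1
        else:
            rncv.create(info)           # S.1: address does not exist yet
            created += 1
    return created, corrected           # step 10: terminate successfully
```

The sketch processes addresses sequentially; the table notes that step 3 may also run in parallel, which this example does not attempt.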

### Secondary Processes

#### S.1. Address does not exist in the RNCV address database

<table class="simple-table" id="bkmrk-step-description-act-2" style="width: 100%;"><thead class="simple-table-header"><tr id="bkmrk-step-description-act-3"><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-step-1" style="width: 5.959476%;">Step</th><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-description-1" style="width: 20.619785%;">Description</th><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-actor%28s%29-1" style="width: 8.581645%;">Actor(s)</th><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-input%28s%29-1" style="width: 15.733015%;">Input(s)</th><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-output%28s%29-1" style="width: 15.851779%;">Output(s)</th><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-decision-points-1" style="width: 33.254299%;">Decision points</th></tr></thead><tbody><tr id="bkmrk-1-the-etl-process-cr"><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-1-1" style="width: 5.959476%;">**1**</th><td class="align-left" id="bkmrk-the-etl-process-crea" style="width: 20.619785%;">The ETL process creates a new Address with the flag “validated = true”</td><td class="align-left" id="bkmrk-etl-9" style="width: 8.581645%;">ETL</td><td class="align-left" id="bkmrk---address-informatio-2" style="width: 15.733015%;">- address information to be ingested</td><td class="align-left" id="bkmrk---nrvc-address-to-be" style="width: 15.851779%;">- RNCV address to be created</td><td class="align-left" id="bkmrk--9" style="width: 33.254299%;"></td></tr><tr id="bkmrk-2-the-system-creates"><th class="simple-table-header-color simple-table-header align-left" id="bkmrk-2-1" style="width: 5.959476%;">**2**</th><td class="align-left" id="bkmrk-the-system-creates-t" style="width: 20.619785%;">The system creates the given address</td><td class="align-left" 
id="bkmrk-system-1" style="width: 8.581645%;">System</td><td class="align-left" id="bkmrk---nrvc-address-to-be-1" style="width: 15.733015%;">- RNCV address to be created</td><td class="align-left" id="bkmrk---nrvc-created-nrvc" style="width: 15.851779%;">- RNCV address created </td><td class="align-left" id="bkmrk-go-to-main-process-s" style="width: 33.254299%;">Go to Main process step 9</td></tr></tbody></table>

## Additional Information

### Error processing during the ETL process

If an error occurs during the ETL process (an internal error, or an error while using an external API), the system should log the error and process the next address. An error triggered during the processing of one address should never interrupt the ETL process for subsequent addresses, unless it is a system-wide error that would prevent all addresses from being processed.
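The error-handling rule above can be sketched as a loop that logs and skips per-address failures but stops on a system-wide one. This is a minimal illustration: the `SystemWideError` class, the `process_all` function, and the `process_one` callback are hypothetical names, and how the real system distinguishes a system-wide error from a per-address error is not specified in this document.

```python
import logging

logger = logging.getLogger("etl")


class SystemWideError(Exception):
    """Hypothetical marker for errors that prevent all addresses from being processed."""


def process_all(addresses, process_one):
    """Process each address, logging and skipping per-address failures.

    Returns the list of addresses that failed. A SystemWideError is logged
    and re-raised, which interrupts the run.
    """
    failed = []
    for address in addresses:
        try:
            process_one(address)
        except SystemWideError:
            logger.exception("System-wide error, interrupting the ETL run")
            raise
        except Exception:
            # Per-address error: log it and continue with the next address.
            logger.exception("Failed to process address %r, continuing", address)
            failed.append(address)
    return failed
```

Returning the failed addresses is a design choice of this sketch; it leaves room for a retry or a report at the end of the run, neither of which is required by the text above.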