Last update: 20.08.2022
Contact : fdrs@ifrc.org or simon.weiss@ifrc.org
FDRS focal point: Simon Weiss, FDRS Data Analyst

How to deal with Missing Values? FDRS imputation methodology in python¶


The FDRS is ambitious and wide reaching. Although the data quality and reporting are improving each year, data is missing for some National Societies. As a result, some data fluctuations may be misleading: trend lines can drop for a given year when there is missing data, and some National Societies are excluded from the total and then appear again in another year.In order to better represent the network and better count everyone, FDRS implement every year data imputation techniques.

The purpose of this notebook is to apply the method selected by the FDRS team and to interact with the FDRS backoffice with a post method to publish the imputed values. The ingested data is replicated and displayed on the website https://data.ifrc.org/FDRS/ and used in FDRS research such as Everyone count report.

Therefore this is not a research notebook but a production notebook. It aims to simply expose the FDRS methodology of imputation step by step. A related python script has been created: imputing.py which allows to launch this methodology once.

The approach chosen was to replace the 2019, 2020 and 2021 missing data as well as to apply two different techniques according to the indicator categories, in the previous years all NSs reported their data then no input technique was employed.

The imputing applies only to main indicators and does not apply to disaggregated levels to maintain consistency across years. A detailed description of the methodology is available in the pdf Missing Data

This notebook contains four sections

  • Get FDRS Data
  • Run imputing missing values
  • Optimization (optional)
  • Post method - interact FDRS BE

We'll get started by loading the python modules in our analysis toolkit.

0.Import package¶

In [ ]:
# data analysis and wrangling
import pandas as pd
import numpy as np
import requests, json
from pandas import json_normalize
import json
from datetime import datetime
import os
import sys
import warnings
warnings.filterwarnings('ignore')

1. Get Data¶


1.1 Import modules¶

In [ ]:
#import api functions to get data
module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(module_path+"\\src\data")
from fdrsapi import api_function,baseline, api_function_imputed
In [ ]:
#generate fdrs codebook and store it in references folder
KPI = requests.get(f"https://data-api.ifrc.org/api/indicator?apiKey=21e401ae-6b35-404b-a72a-b74cce66dee3").json()
KPI = json_normalize(data=KPI)
#export KPI to excel in references folder
KPI.to_excel(r"..\references\codebook.xlsx")
In [ ]:
#import codebook 
codebook = pd.read_excel(r"..\references\codebook.xlsx")
#select only  NS Reach CPD KPIs
kpi=(codebook.query('KPI_Note== "NS Reach - CPD"')["KPI_Code"]).to_list()
#add manualy the other KPIS
kpi_code=kpi+["KPI_TrainFA_Tot","KPI_DonBlood_Tot"]+["KPI_GB_Tot","KPI_PeopleVol_Tot","KPI_PStaff_Tot","KPI_noLocalUnits","KPI_IncomeLC_CHF" ,"KPI_expenditureLC_CHF"]+["validated"]+["KPI_WasSubmitted"]
In [ ]:
#counting nb of considered KPIS + removing metadata kpis 
print("number of total KPI considered:", len(kpi_code)-2)
number of total KPI considered: 23

New indicators: remove

In [ ]:
#remove the 2021 new indicators KPI collected from kpi_code
old_kpi=kpi_code.copy()
new_kpi=["KPI_Climate_CPD", "KPI_ClimateHeat_CPD", "KPI_ReachHI_CPD", "KPI_ReachHPM_CPD","KPI_ReachRCRCEd_CPD"]
for x in new_kpi:
    old_kpi.remove(x)
In [ ]:
#check  final kpi list
old_kpi
print("number of old KPI considered:", len(old_kpi))
number of old KPI considered: 20

1.2 Get data from FDRS API¶

In [ ]:
#read public api key from config file
f = open(module_path+"/src/config/public_api_key.txt")
api_key=f.readline()
#select years to download data from backoffice
years=["2018","2019","2020","2021"]
kpi_float=old_kpi.copy()
#run api function for each year
time_series=api_function(years,old_kpi,kpi_float,api_key)
time_series
2018
2019
2020
2021
number of KPI downloaded: 20
Out[ ]:
KPI ID KPI_DON_code NSO_DON_name NSO_ZON_name iso_3 KPI_Year KPI_DonBlood_Tot KPI_GB_Tot KPI_IncomeLC_CHF KPI_PStaff_Tot ... KPI_ReachL_CPD KPI_ReachM_CPD KPI_ReachSI_CPD KPI_ReachS_CPD KPI_ReachWASH_CPD KPI_TrainFA_Tot KPI_WasSubmitted KPI_expenditureLC_CHF KPI_noLocalUnits validated
0 DAF0012018 DAF001 Afghan Red Crescent Asia Pacific AFG 2018 0.0 15.0 1.803370e+07 2195.0 ... NaN NaN NaN NaN NaN 1020.0 1.0 1.847249e+07 122.0 1.0
1 DAL0012018 DAL001 Albanian Red Cross Europe and Central Asia ALB 2018 6531.0 14.0 1.555610e+06 53.0 ... 12800.0 0.0 34150.0 0.0 47000.0 62709.0 1.0 1.348786e+06 39.0 1.0
2 DDZ0012018 DDZ001 Algerian Red Crescent Middle East and North Africa DZA 2018 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN 1.0 NaN 248.0 1.0
3 DUS0012018 DUS001 American Red Cross Americas USA 2018 2662986.0 15.0 3.656908e+09 19649.0 ... 2308469.0 0.0 0.0 0.0 0.0 2283480.0 1.0 3.193028e+09 263.0 1.0
4 DAD0012018 DAD001 Andorran Red Cross Europe and Central Asia AND 2018 1084.0 9.0 1.177623e+06 20.0 ... NaN NaN 1117.0 NaN NaN 2901.0 1.0 1.379745e+06 1.0 0.0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
763 DVE0012021 DVE001 Venezuelan Red Cross Americas VEN 2021 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
764 DVN0012021 DVN001 Vietnam Red Cross Society Asia Pacific VNM 2021 1304191.0 5.0 NaN 16008.0 ... NaN NaN NaN NaN 141286.0 26811.0 1.0 NaN 12693.0 NaN
765 DYE0012021 DYE001 Yemen Red Crescent Society Middle East and North Africa YEM 2021 NaN 6.0 6.513051e+06 454.0 ... 538645.0 576219.0 21534.0 99003.0 419470.0 3404.0 1.0 5.273168e+06 32.0 1.0
766 DZM0012021 DZM001 Zambia Red Cross Society Africa ZMB 2021 NaN 10.0 2.930602e+06 66.0 ... 1249.0 4807.0 NaN 169.0 238278.0 1720.0 1.0 2.896418e+06 59.0 1.0
767 DZW0012021 DZW001 Zimbabwe Red Cross Society Africa ZWE 2021 NaN 17.0 NaN 147.0 ... 32163.0 1966.0 98386.0 62.0 77623.0 27376.0 1.0 NaN 132.0 1.0

768 rows × 26 columns

1.3 Deal with 2021 New Indicators¶

Deal with new indicators: <!> To update everytime we run the script

  • No values at Kpivalue? for the 2021 FDRS new values, only available at data? endpoint
  • From Powerbi pipeline set up a excel export called "FDRS2021NewKPIValues"
  • Applied some power query transformation (KPI names reverted back to KPI code, selected Year)
  • Moved it in the fdrs_imputing_missing_values repo
  • Perfomed Merge Statement in pandas Left join on iso and years
  • time_series completed with new value
In [ ]:
#read excel from data/raw folder
new_time_series=pd.read_excel(module_path+"/data/raw/FDRS2021NewKPIValues.xlsx")
In [ ]:
#print new_time_series to check
new_time_series
Out[ ]:
iso_3 KPI_Year KPI_ReachRCRCEd_CPD KPI_Climate_CPD KPI_ClimateHeat_CPD KPI_ReachHI_CPD KPI_ReachHPM_CPD
0 AFG 2018 NaN NaN NaN NaN NaN
1 AFG 2019 NaN NaN NaN NaN NaN
2 AFG 2020 NaN NaN NaN NaN NaN
3 AFG 2021 105.0 NaN NaN 455730.0 4646.0
4 AGO 2018 NaN NaN NaN NaN NaN
... ... ... ... ... ... ... ...
674 ZMB 2020 NaN NaN NaN NaN NaN
675 ZMB 2021 0.0 0.0 NaN 0.0 382.0
676 ZWE 2018 NaN NaN NaN NaN NaN
677 ZWE 2019 NaN NaN NaN NaN NaN
678 ZWE 2020 NaN NaN NaN NaN NaN

679 rows × 7 columns

In [ ]:
#merge new time series with old time series
time_series=time_series.merge(new_time_series, how='left', on=['iso_3','KPI_Year'])
In [ ]:
#print to check
time_series
Out[ ]:
KPI ID KPI_DON_code NSO_DON_name NSO_ZON_name iso_3 KPI_Year KPI_DonBlood_Tot KPI_GB_Tot KPI_IncomeLC_CHF KPI_PStaff_Tot ... KPI_TrainFA_Tot KPI_WasSubmitted KPI_expenditureLC_CHF KPI_noLocalUnits validated KPI_ReachRCRCEd_CPD KPI_Climate_CPD KPI_ClimateHeat_CPD KPI_ReachHI_CPD KPI_ReachHPM_CPD
0 DAF0012018 DAF001 Afghan Red Crescent Asia Pacific AFG 2018 0.0 15.0 1.803370e+07 2195.0 ... 1020.0 1.0 1.847249e+07 122.0 1.0 NaN NaN NaN NaN NaN
1 DAL0012018 DAL001 Albanian Red Cross Europe and Central Asia ALB 2018 6531.0 14.0 1.555610e+06 53.0 ... 62709.0 1.0 1.348786e+06 39.0 1.0 NaN NaN NaN NaN NaN
2 DDZ0012018 DDZ001 Algerian Red Crescent Middle East and North Africa DZA 2018 NaN NaN NaN NaN ... NaN 1.0 NaN 248.0 1.0 NaN NaN NaN NaN NaN
3 DUS0012018 DUS001 American Red Cross Americas USA 2018 2662986.0 15.0 3.656908e+09 19649.0 ... 2283480.0 1.0 3.193028e+09 263.0 1.0 NaN NaN NaN NaN NaN
4 DAD0012018 DAD001 Andorran Red Cross Europe and Central Asia AND 2018 1084.0 9.0 1.177623e+06 20.0 ... 2901.0 1.0 1.379745e+06 1.0 0.0 NaN NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
763 DVE0012021 DVE001 Venezuelan Red Cross Americas VEN 2021 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
764 DVN0012021 DVN001 Vietnam Red Cross Society Asia Pacific VNM 2021 1304191.0 5.0 NaN 16008.0 ... 26811.0 1.0 NaN 12693.0 NaN NaN NaN NaN NaN NaN
765 DYE0012021 DYE001 Yemen Red Crescent Society Middle East and North Africa YEM 2021 NaN 6.0 6.513051e+06 454.0 ... 3404.0 1.0 5.273168e+06 32.0 1.0 18300.0 NaN NaN 41879.0 1769102.0
766 DZM0012021 DZM001 Zambia Red Cross Society Africa ZMB 2021 NaN 10.0 2.930602e+06 66.0 ... 1720.0 1.0 2.896418e+06 59.0 1.0 0.0 0.0 NaN 0.0 382.0
767 DZW0012021 DZW001 Zimbabwe Red Cross Society Africa ZWE 2021 NaN 17.0 NaN 147.0 ... 27376.0 1.0 NaN 132.0 1.0 NaN NaN NaN NaN NaN

768 rows × 31 columns

1.4 Pre-processing for the imputation : validate previous years data¶

We Consider that previous year data has been validated

In [ ]:
#count nb of validated submission per years
time_series.query('KPI_Year == 2019')["validated"].sum(),time_series.query('KPI_Year == 2020')["validated"].sum(), time_series.query('KPI_Year == 2021')["validated"].sum()
Out[ ]:
(158.0, 167.0, 45.0)
In [ ]:
##replace validated =0 for KPI year 2020,2019,2018 with validated =1
time_series.loc[(time_series["KPI_Year"]==2020) & (time_series["validated"]==0), "validated"]=1
time_series.loc[(time_series["KPI_Year"]==2019) & (time_series["validated"]==0), "validated"]=1
time_series.loc[(time_series["KPI_Year"]==2018) & (time_series["validated"]==0), "validated"]=1

2. Run imputing missing values¶


2.1 Import modules¶

In [ ]:
#import imputing function 
module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(module_path+"\\src\data")
from imp_function import imputing_mean,imputing_na

2.2 Run imputing functions¶

In [ ]:
data_clean=time_series.copy()


# 1st subset:before 2018
data_clean1=data_clean.query('KPI_Year >= 2012 & KPI_Year < 2018').copy()

# 2nd subset: after 2018 and validated
data_clean2=data_clean.query('KPI_Year >= 2018 & KPI_Year < 2022 & validated == 1').copy()



# 3rd subset: after 2018 and all data; idea : imputing all then filter out the validated data
data_clean3=data_clean.query('KPI_Year >= 2018').copy()
data_clean3.loc[(data_clean3["KPI_Year"]==2021) & (data_clean3["validated"]==0.0), kpi_code]=np.nan

# Imputing : NS Reach CPD KPIs
data_clean3_na_columns=data_clean3.copy()
#import codebook 
codebook = pd.read_excel(r"..\references\codebook.xlsx")
#select only NS Reach CPD KPIs
kpi=(codebook.query('KPI_Note== "NS Reach - CPD"')["KPI_Code"]).to_list()
#add manualy the other Reach bKPIS
kpi_reach_code=kpi+["KPI_TrainFA_Tot","KPI_DonBlood_Tot"]
selected_cols=kpi_reach_code
imputing_mean(data_clean3_na_columns,selected_cols,data_clean3)


# Imputing : NS Finance and Parnership & NS Governance and Structure Section
kpi_gov_fi_code=["KPI_GB_Tot","KPI_PeopleVol_Tot","KPI_PStaff_Tot","KPI_noLocalUnits","KPI_IncomeLC_CHF" ,"KPI_expenditureLC_CHF"]
selected_cols_2d=kpi_gov_fi_code.copy()

# 18.11.21Comment : After reviewing with the Team, "supported1" and "received_support1" have been imputed seperatly in the script FDRS Network Data Transformation
imputing_na(data_clean3_na_columns,selected_cols_2d)


data_clean3_na_columns=data_clean3_na_columns.query('KPI_Year >= 2018 & KPI_Year < 2022 & validated != 1').copy()

# concatenate the two sub_datasets
fdrs_na_columns = [data_clean1, data_clean2, data_clean3_na_columns]
fdrs_data_fdrs_na_columns = pd.concat(fdrs_na_columns)
KPI_Climate_CPD
KPI_ClimateHeat_CPD
KPI_ReachCTP_CPD
KPI_ReachDRER_CPD
KPI_ReachDRR_CPD
KPI_ReachH_CPD
KPI_ReachHI_CPD
KPI_ReachHPM_CPD
KPI_ReachL_CPD
KPI_ReachLTSPD_CPD
KPI_ReachM_CPD
KPI_ReachRCRCEd_CPD
KPI_ReachS_CPD
KPI_ReachSI_CPD
KPI_ReachWASH_CPD
KPI_TrainFA_Tot
KPI_DonBlood_Tot
KPI_GB_Tot
KPI_PeopleVol_Tot
KPI_PStaff_Tot
KPI_noLocalUnits
KPI_IncomeLC_CHF
KPI_expenditureLC_CHF

Quick check for sum in Vol after imputing

In [ ]:
fdrs_data_fdrs_na_columns[fdrs_data_fdrs_na_columns["KPI_Year"]==2021][["NSO_DON_name","KPI_Year","KPI_PeopleVol_Tot"]].sum()
Out[ ]:
NSO_DON_name         Austrian Red CrossBaphalali Eswatini Red Cross...
KPI_Year                                                        388032
KPI_PeopleVol_Tot                                           15061782.0
dtype: object
In [ ]:
fdrs_data_fdrs_na_columns[fdrs_data_fdrs_na_columns["KPI_Year"]==2020][["NSO_DON_name","KPI_Year","KPI_PeopleVol_Tot"]].sum()
Out[ ]:
NSO_DON_name         Afghan Red CrescentAlbanian Red CrossAlgerian ...
KPI_Year                                                        387840
KPI_PeopleVol_Tot                                           14525876.0
dtype: object

2.3 Export imputed values in Excel¶

Quick process to ingest imputed values in an excel read by PBI

In [ ]:
#generate kpi_code_ip names from kpi_code
kpi_code=kpi_reach_code+kpi_gov_fi_code
kpi_code_ip= [s + "_IP" for s in kpi_code]
In [ ]:
#print original dataset 
fdrs_data_fdrs_na_columns
Out[ ]:
KPI ID KPI_DON_code NSO_DON_name NSO_ZON_name iso_3 KPI_Year KPI_DonBlood_Tot KPI_GB_Tot KPI_IncomeLC_CHF KPI_PStaff_Tot ... KPI_TrainFA_Tot KPI_WasSubmitted KPI_expenditureLC_CHF KPI_noLocalUnits validated KPI_ReachRCRCEd_CPD KPI_Climate_CPD KPI_ClimateHeat_CPD KPI_ReachHI_CPD KPI_ReachHPM_CPD
0 DAF0012018 DAF001 Afghan Red Crescent Asia Pacific AFG 2018 0.0 15.0 1.803370e+07 2195.0 ... 1.020000e+03 1.0 1.847249e+07 122.0 1.0 NaN NaN NaN NaN NaN
1 DAL0012018 DAL001 Albanian Red Cross Europe and Central Asia ALB 2018 6531.0 14.0 1.555610e+06 53.0 ... 6.270900e+04 1.0 1.348786e+06 39.0 1.0 NaN NaN NaN NaN NaN
2 DDZ0012018 DDZ001 Algerian Red Crescent Middle East and North Africa DZA 2018 NaN NaN NaN NaN ... NaN 1.0 NaN 248.0 1.0 NaN NaN NaN NaN NaN
3 DUS0012018 DUS001 American Red Cross Americas USA 2018 2662986.0 15.0 3.656908e+09 19649.0 ... 2.283480e+06 1.0 3.193028e+09 263.0 1.0 NaN NaN NaN NaN NaN
4 DAD0012018 DAD001 Andorran Red Cross Europe and Central Asia AND 2018 1084.0 9.0 1.177623e+06 20.0 ... 2.901000e+03 1.0 1.379745e+06 1.0 1.0 NaN NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
760 DUA0012021 DUA001 Ukrainian Red Cross Society Europe and Central Asia UKR 2021 809.5 75.0 1.009907e+07 535.0 ... 3.889000e+04 NaN 1.050011e+07 224.0 NaN NaN NaN NaN NaN NaN
761 DUY0012021 DUY001 Uruguayan Red Cross Americas URY 2021 12.0 5.0 1.000076e+06 52.0 ... 1.138667e+03 NaN 9.840340e+05 19.0 NaN NaN NaN NaN NaN NaN
762 DVU0012021 DVU001 Vanuatu Red Cross Society Asia Pacific VUT 2021 0.0 10.0 7.895120e+05 37.0 ... 6.026667e+02 NaN 9.459050e+05 7.0 NaN NaN NaN NaN NaN NaN
763 DVE0012021 DVE001 Venezuelan Red Cross Americas VEN 2021 317.5 11.0 0.000000e+00 742.0 ... 6.533000e+03 NaN 0.000000e+00 42.0 NaN NaN NaN NaN NaN NaN
764 DVN0012021 DVN001 Vietnam Red Cross Society Asia Pacific VNM 2021 1304191.0 5.0 5.918624e+06 16008.0 ... 2.681100e+04 1.0 2.976457e+06 12693.0 NaN NaN NaN NaN NaN NaN

768 rows × 31 columns

In [ ]:
kpi_code_ip
Out[ ]:
['KPI_Climate_CPD_IP',
 'KPI_ClimateHeat_CPD_IP',
 'KPI_ReachCTP_CPD_IP',
 'KPI_ReachDRER_CPD_IP',
 'KPI_ReachDRR_CPD_IP',
 'KPI_ReachH_CPD_IP',
 'KPI_ReachHI_CPD_IP',
 'KPI_ReachHPM_CPD_IP',
 'KPI_ReachL_CPD_IP',
 'KPI_ReachLTSPD_CPD_IP',
 'KPI_ReachM_CPD_IP',
 'KPI_ReachRCRCEd_CPD_IP',
 'KPI_ReachS_CPD_IP',
 'KPI_ReachSI_CPD_IP',
 'KPI_ReachWASH_CPD_IP',
 'KPI_TrainFA_Tot_IP',
 'KPI_DonBlood_Tot_IP',
 'KPI_GB_Tot_IP',
 'KPI_PeopleVol_Tot_IP',
 'KPI_PStaff_Tot_IP',
 'KPI_noLocalUnits_IP',
 'KPI_IncomeLC_CHF_IP',
 'KPI_expenditureLC_CHF_IP']
In [ ]:
#rename list of columns kpi_codes to kpi_codes_ip
fdrs_data_fdrs_na_columns.rename(columns=dict(zip(kpi_code, kpi_code_ip)), inplace=True)
In [ ]:
#print dataset with new names
fdrs_data_fdrs_na_columns
Out[ ]:
KPI ID KPI_DON_code NSO_DON_name NSO_ZON_name iso_3 KPI_Year KPI_DonBlood_Tot_IP KPI_GB_Tot_IP KPI_IncomeLC_CHF_IP KPI_PStaff_Tot_IP ... KPI_TrainFA_Tot_IP KPI_WasSubmitted KPI_expenditureLC_CHF_IP KPI_noLocalUnits_IP validated KPI_ReachRCRCEd_CPD_IP KPI_Climate_CPD_IP KPI_ClimateHeat_CPD_IP KPI_ReachHI_CPD_IP KPI_ReachHPM_CPD_IP
0 DAF0012018 DAF001 Afghan Red Crescent Asia Pacific AFG 2018 0.0 15.0 1.803370e+07 2195.0 ... 1.020000e+03 1.0 1.847249e+07 122.0 1.0 NaN NaN NaN NaN NaN
1 DAL0012018 DAL001 Albanian Red Cross Europe and Central Asia ALB 2018 6531.0 14.0 1.555610e+06 53.0 ... 6.270900e+04 1.0 1.348786e+06 39.0 1.0 NaN NaN NaN NaN NaN
2 DDZ0012018 DDZ001 Algerian Red Crescent Middle East and North Africa DZA 2018 NaN NaN NaN NaN ... NaN 1.0 NaN 248.0 1.0 NaN NaN NaN NaN NaN
3 DUS0012018 DUS001 American Red Cross Americas USA 2018 2662986.0 15.0 3.656908e+09 19649.0 ... 2.283480e+06 1.0 3.193028e+09 263.0 1.0 NaN NaN NaN NaN NaN
4 DAD0012018 DAD001 Andorran Red Cross Europe and Central Asia AND 2018 1084.0 9.0 1.177623e+06 20.0 ... 2.901000e+03 1.0 1.379745e+06 1.0 1.0 NaN NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
760 DUA0012021 DUA001 Ukrainian Red Cross Society Europe and Central Asia UKR 2021 809.5 75.0 1.009907e+07 535.0 ... 3.889000e+04 NaN 1.050011e+07 224.0 NaN NaN NaN NaN NaN NaN
761 DUY0012021 DUY001 Uruguayan Red Cross Americas URY 2021 12.0 5.0 1.000076e+06 52.0 ... 1.138667e+03 NaN 9.840340e+05 19.0 NaN NaN NaN NaN NaN NaN
762 DVU0012021 DVU001 Vanuatu Red Cross Society Asia Pacific VUT 2021 0.0 10.0 7.895120e+05 37.0 ... 6.026667e+02 NaN 9.459050e+05 7.0 NaN NaN NaN NaN NaN NaN
763 DVE0012021 DVE001 Venezuelan Red Cross Americas VEN 2021 317.5 11.0 0.000000e+00 742.0 ... 6.533000e+03 NaN 0.000000e+00 42.0 NaN NaN NaN NaN NaN NaN
764 DVN0012021 DVN001 Vietnam Red Cross Society Asia Pacific VNM 2021 1304191.0 5.0 5.918624e+06 16008.0 ... 2.681100e+04 1.0 2.976457e+06 12693.0 NaN NaN NaN NaN NaN NaN

768 rows × 31 columns

In [ ]:
#export imputed values in an excel file. 
#to change: staging in prod when final
#fdrs_data_fdrs_na_columns.to_excel("C:/Users/simon.weiss/OneDrive - IFRC/Tech talk and files/1_PowerBI/1_FDRS/0.Data_Sources/1_FDRS_Wide_Format_IP_ALL_Staging.xlsx",index=False)
#fdrs_data_fdrs_na_columns.to_excel("C:/Users/simon.weiss/OneDrive - IFRC/Tech talk and files/1_PowerBI/1_FDRS/0.Data_Sources/1_FDRS_Wide_Format_IP_ALL_Staging.xlsx",index=False)

3. Optimization: Get imputed values t-1 from backoffice to compare with imputed values t (optional)¶


In [ ]:
KPI = requests.get(f"https://data-api.ifrc.org/api/indicator?apiKey=21e401ae-6b35-404b-a72a-b74cce66dee3").json()
KPI = json_normalize(data=KPI)
kpi=(KPI.query('KPI_Note== "NS Reach - CPD"')["KPI_Code"]).to_list()
kpi_code=kpi+["KPI_TrainFA_Tot","KPI_DonBlood_Tot"]+["KPI_GB_Tot","KPI_PeopleVol_Tot","KPI_PStaff_Tot","KPI_noLocalUnits","KPI_IncomeLC_CHF" ,"KPI_expenditureLC_CHF"]
In [ ]:
new_kpi=["KPI_Climate_CPD", "KPI_ClimateHeat_CPD", "KPI_ReachHI_CPD", "KPI_ReachHPM_CPD","KPI_ReachRCRCEd_CPD"]
for x in new_kpi:
    kpi_code.remove(x)
In [ ]:
kpi_code= [s + "_IP" for s in kpi_code]
kpi_float=kpi_code.copy()

Get imputed values from backoffice

In [ ]:
years=["2021"]
In [ ]:
time_series_imputed=api_function_imputed(years,kpi_code,kpi_float,api_key)
time_series_imputed
2021
number of KPI downloaded: 18
Out[ ]:
KPI ID KPI_DON_code NSO_DON_name NSO_ZON_name iso_3 KPI_Year nan KPI_DonBlood_Tot_IP KPI_GB_Tot_IP KPI_IncomeLC_CHF_IP ... KPI_ReachH_CPD_IP KPI_ReachLTSPD_CPD_IP KPI_ReachL_CPD_IP KPI_ReachM_CPD_IP KPI_ReachSI_CPD_IP KPI_ReachS_CPD_IP KPI_ReachWASH_CPD_IP KPI_TrainFA_Tot_IP KPI_expenditureLC_CHF_IP KPI_noLocalUnits_IP
0 DAF0012021 DAF001 Afghan Red Crescent Asia Pacific AFG 2021 NaN NaN NaN 1.114095e+07 ... 2145871.0 2145871.0 219872.0 NaN NaN NaN 0.0 3182.0 1.156653e+07 122.0
1 DAL0012021 DAL001 Albanian Red Cross Europe and Central Asia ALB 2021 NaN NaN 14.0 2.270842e+06 ... NaN NaN NaN NaN NaN NaN NaN NaN 1.913412e+06 39.0
2 DDZ0012021 DDZ001 Algerian Red Crescent Middle East and North Africa DZA 2021 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 DUS0012021 DUS001 American Red Cross Americas USA 2021 NaN 2218346.0 14.0 2.787293e+09 ... 6339537.0 7919326.0 244095.0 NaN NaN 1512380.0 NaN 6190964.0 2.521967e+09 234.0
4 DAD0012021 DAD001 Andorran Red Cross Europe and Central Asia AND 2021 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
187 DVE0012021 DVE001 Venezuelan Red Cross Americas VEN 2021 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
188 DVN0012021 DVN001 Vietnam Red Cross Society Asia Pacific VNM 2021 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
189 DYE0012021 DYE001 Yemen Red Crescent Society Middle East and North Africa YEM 2021 NaN NaN 6.0 6.513051e+06 ... 2805849.0 2449937.0 538645.0 576219.0 21534.0 99003.0 419470.0 3404.0 5.273168e+06 32.0
190 DZM0012021 DZM001 Zambia Red Cross Society Africa ZMB 2021 NaN NaN 10.0 2.930602e+06 ... 3414257.0 14776.0 1249.0 4807.0 NaN 169.0 238278.0 1720.0 2.896418e+06 2.0
191 DZW0012021 DZW001 Zimbabwe Red Cross Society Africa ZWE 2021 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

192 rows × 25 columns

In [ ]:
time_series_imputed.drop("nan",axis=1,inplace=True)
In [ ]:
new_time_series_ip=pd.read_excel("FDRS2021NewKPIValues.xlsx")
In [ ]:
# renaming the column by index
new_time_series_ip.columns.values[2:7] =["KPI_ReachRCRCEd_CPD_IP", "KPI_Climate_CPD_IP","KPI_ClimateHeat_CPD_IP","KPI_ReachHI_CPD_IP","KPI_ReachHPM_CPD_IP"]
In [ ]:
time_series_imputed=time_series_imputed.merge(new_time_series_ip, how='left', on=['iso_3','KPI_Year'])
time_series_imputed
Out[ ]:
KPI ID KPI_DON_code NSO_DON_name NSO_ZON_name iso_3 KPI_Year KPI_DonBlood_Tot_IP KPI_GB_Tot_IP KPI_IncomeLC_CHF_IP KPI_PStaff_Tot_IP ... KPI_ReachS_CPD_IP KPI_ReachWASH_CPD_IP KPI_TrainFA_Tot_IP KPI_expenditureLC_CHF_IP KPI_noLocalUnits_IP KPI_ReachRCRCEd_CPD_IP KPI_Climate_CPD_IP KPI_ClimateHeat_CPD_IP KPI_ReachHI_CPD_IP KPI_ReachHPM_CPD_IP
0 DAF0012021 DAF001 Afghan Red Crescent Asia Pacific AFG 2021 NaN NaN 1.114095e+07 2309.0 ... NaN 0.0 3182.0 1.156653e+07 122.0 105.0 NaN NaN 455730.0 4646.0
1 DAL0012021 DAL001 Albanian Red Cross Europe and Central Asia ALB 2021 NaN 14.0 2.270842e+06 58.0 ... NaN NaN NaN 1.913412e+06 39.0 NaN NaN NaN NaN NaN
2 DDZ0012021 DDZ001 Algerian Red Crescent Middle East and North Africa DZA 2021 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 DUS0012021 DUS001 American Red Cross Americas USA 2021 2218346.0 14.0 2.787293e+09 17495.0 ... 1512380.0 NaN 6190964.0 2.521967e+09 234.0 NaN NaN NaN NaN 38110.0
4 DAD0012021 DAD001 Andorran Red Cross Europe and Central Asia AND 2021 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
187 DVE0012021 DVE001 Venezuelan Red Cross Americas VEN 2021 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
188 DVN0012021 DVN001 Vietnam Red Cross Society Asia Pacific VNM 2021 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
189 DYE0012021 DYE001 Yemen Red Crescent Society Middle East and North Africa YEM 2021 NaN 6.0 6.513051e+06 454.0 ... 99003.0 419470.0 3404.0 5.273168e+06 32.0 18300.0 NaN NaN 41879.0 1769102.0
190 DZM0012021 DZM001 Zambia Red Cross Society Africa ZMB 2021 NaN 10.0 2.930602e+06 66.0 ... 169.0 238278.0 1720.0 2.896418e+06 2.0 0.0 0.0 NaN 0.0 382.0
191 DZW0012021 DZW001 Zimbabwe Red Cross Society Africa ZWE 2021 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

192 rows × 29 columns

Code the values that are different.
0 for values that do not need to be updated,
1 for values that need to be updated.

In [ ]:
KPI = requests.get(f"https://data-api.ifrc.org/api/indicator?apiKey=21e401ae-6b35-404b-a72a-b74cce66dee3").json()
KPI = json_normalize(data=KPI)
kpi=(KPI.query('KPI_Note== "NS Reach - CPD"')["KPI_Code"]).to_list()
kpi_code=kpi+["KPI_TrainFA_Tot","KPI_DonBlood_Tot"]+["KPI_GB_Tot","KPI_PeopleVol_Tot","KPI_PStaff_Tot","KPI_noLocalUnits","KPI_IncomeLC_CHF" ,"KPI_expenditureLC_CHF"]
In [ ]:
#kpi_code=["KPI_PeopleVol_Tot","KPI_PStaff_Tot","KPI_DonBlood_Tot","KPI_TrainFA_Tot","KPI_noLocalUnits","KPI_ReachDRER_CPD","KPI_ReachLTSPD_CPD","KPI_ReachDRR_CPD","KPI_ReachS_CPD","KPI_ReachL_CPD","KPI_ReachH_CPD","KPI_ReachWASH_CPD","KPI_ReachM_CPD","KPI_ReachCTP_CPD","KPI_ReachSI_CPD","KPI_IncomeLC_CHF","KPI_expenditureLC_CHF"]
#kpi_code=["KPI_PeopleVol_Tot"]

# Convert to type int to remove decimals from calculations
#fdrs_data_fdrs_na_columns[kpi_code]=fdrs_data_fdrs_na_columns[kpi_code].fillna(0).astype('int')
#fdrs_data_fdrs_na_columns[kpi_code]=fdrs_data_fdrs_na_columns[kpi_code].replace(0,np.nan)
In [ ]:
time_series_imputed
Out[ ]:
KPI ID KPI_DON_code NSO_DON_name NSO_ZON_name iso_3 KPI_Year KPI_DonBlood_Tot_IP KPI_GB_Tot_IP KPI_IncomeLC_CHF_IP KPI_PStaff_Tot_IP ... KPI_ReachS_CPD_IP KPI_ReachWASH_CPD_IP KPI_TrainFA_Tot_IP KPI_expenditureLC_CHF_IP KPI_noLocalUnits_IP KPI_ReachRCRCEd_CPD_IP KPI_Climate_CPD_IP KPI_ClimateHeat_CPD_IP KPI_ReachHI_CPD_IP KPI_ReachHPM_CPD_IP
0 DAF0012021 DAF001 Afghan Red Crescent Asia Pacific AFG 2021 NaN NaN 1.114095e+07 2309.0 ... NaN 0.0 3182.0 1.156653e+07 122.0 105.0 NaN NaN 455730.0 4646.0
1 DAL0012021 DAL001 Albanian Red Cross Europe and Central Asia ALB 2021 NaN 14.0 2.270842e+06 58.0 ... NaN NaN NaN 1.913412e+06 39.0 NaN NaN NaN NaN NaN
2 DDZ0012021 DDZ001 Algerian Red Crescent Middle East and North Africa DZA 2021 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 DUS0012021 DUS001 American Red Cross Americas USA 2021 2218346.0 14.0 2.787293e+09 17495.0 ... 1512380.0 NaN 6190964.0 2.521967e+09 234.0 NaN NaN NaN NaN 38110.0
4 DAD0012021 DAD001 Andorran Red Cross Europe and Central Asia AND 2021 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
187 DVE0012021 DVE001 Venezuelan Red Cross Americas VEN 2021 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
188 DVN0012021 DVN001 Vietnam Red Cross Society Asia Pacific VNM 2021 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
189 DYE0012021 DYE001 Yemen Red Crescent Society Middle East and North Africa YEM 2021 NaN 6.0 6.513051e+06 454.0 ... 99003.0 419470.0 3404.0 5.273168e+06 32.0 18300.0 NaN NaN 41879.0 1769102.0
190 DZM0012021 DZM001 Zambia Red Cross Society Africa ZMB 2021 NaN 10.0 2.930602e+06 66.0 ... 169.0 238278.0 1720.0 2.896418e+06 2.0 0.0 0.0 NaN 0.0 382.0
191 DZW0012021 DZW001 Zimbabwe Red Cross Society Africa ZWE 2021 NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

192 rows × 29 columns

In [ ]:
fdrs_data_fdrs_na_columns
Out[ ]:
KPI ID KPI_DON_code NSO_DON_name NSO_ZON_name iso_3 KPI_Year KPI_DonBlood_Tot KPI_GB_Tot KPI_IncomeLC_CHF KPI_PStaff_Tot ... KPI_ReachS_CPD KPI_ReachWASH_CPD KPI_TrainFA_Tot KPI_expenditureLC_CHF KPI_noLocalUnits KPI_ReachRCRCEd_CPD KPI_Climate_CPD KPI_ClimateHeat_CPD KPI_ReachHI_CPD KPI_ReachHPM_CPD
0 DAF0012018 DAF001 Afghan Red Crescent Asia Pacific AFG 2018 0.000000e+00 15.0 1.803370e+07 2195.0 ... NaN 60000.0 1.020000e+03 1.847249e+07 122.0 105.0 NaN NaN 455730.0 4646.0
1 DAL0012018 DAL001 Albanian Red Cross Europe and Central Asia ALB 2018 6.531000e+03 14.0 1.555610e+06 53.0 ... 0.000000 47000.0 6.270900e+04 1.348786e+06 39.0 NaN NaN NaN NaN NaN
2 DDZ0012018 DDZ001 Algerian Red Crescent Middle East and North Africa DZA 2018 NaN NaN NaN NaN ... 10000.000000 NaN NaN NaN 248.0 NaN NaN NaN NaN NaN
3 DUS0012018 DUS001 American Red Cross Americas USA 2018 2.662986e+06 15.0 3.656908e+09 19649.0 ... 0.000000 0.0 2.283480e+06 3.193028e+09 263.0 NaN NaN NaN NaN 38110.0
4 DAD0012018 DAD001 Andorran Red Cross Europe and Central Asia AND 2018 1.084000e+03 9.0 1.177623e+06 20.0 ... 0.000000 0.0 2.901000e+03 1.379745e+06 1.0 NaN NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
763 DVE0012021 DVE001 Venezuelan Red Cross Americas VEN 2021 3.175000e+02 11.0 0.000000e+00 742.0 ... 170.500000 126319.0 6.533000e+03 0.000000e+00 42.0 NaN NaN NaN NaN NaN
764 DVN0012021 DVN001 Vietnam Red Cross Society Asia Pacific VNM 2021 1.383143e+06 5.0 5.918624e+06 16806.0 ... 19931.333333 164013.0 1.301043e+05 2.976457e+06 17127.0 NaN NaN NaN NaN NaN
765 DYE0012021 DYE001 Yemen Red Crescent Society Middle East and North Africa YEM 2021 0.000000e+00 6.0 6.513051e+06 454.0 ... 99003.000000 419470.0 3.404000e+03 5.273168e+06 32.0 18300.0 NaN NaN 41879.0 1769102.0
766 DZM0012021 DZM001 Zambia Red Cross Society Africa ZMB 2021 0.000000e+00 10.0 2.930602e+06 66.0 ... 169.000000 238278.0 1.720000e+03 2.896418e+06 59.0 0.0 0.0 NaN 0.0 382.0
767 DZW0012021 DZW001 Zimbabwe Red Cross Society Africa ZWE 2021 0.000000e+00 16.0 3.674074e+06 150.0 ... 5383.000000 287572.0 1.072267e+04 3.380148e+06 157.0 NaN NaN NaN NaN NaN

768 rows × 29 columns

In [ ]:
kpi_code_imp= [s + "_IP" for s in kpi_code]
kpi_code_binary = [s + "_binary" for s in kpi_code]
In [ ]:
#kpi_code=["KPI_PeopleVol_Tot","KPI_PStaff_Tot","KPI_DonBlood_Tot","KPI_TrainFA_Tot","KPI_noLocalUnits","KPI_ReachDRER_CPD","KPI_ReachLTSPD_CPD","KPI_ReachDRR_CPD","KPI_ReachS_CPD","KPI_ReachL_CPD","KPI_ReachH_CPD","KPI_ReachWASH_CPD","KPI_ReachM_CPD","KPI_ReachCTP_CPD","KPI_ReachSI_CPD","KPI_IncomeLC_CHF","KPI_expenditureLC_CHF"]



for kpi in list(kpi_code):
    kpi_code_imp=kpi+"_IP"
    kpi_code_binary=kpi+"_binary"
    fdrs_data_fdrs_na_columns.loc[time_series_imputed[kpi_code_imp] == fdrs_data_fdrs_na_columns[kpi],kpi_code_binary]=0 #0 if same
    fdrs_data_fdrs_na_columns.loc[time_series_imputed[kpi_code_imp] != fdrs_data_fdrs_na_columns[kpi],kpi_code_binary]=1 #1 if different
    
    fdrs_data_fdrs_na_columns.loc[fdrs_data_fdrs_na_columns[kpi].isna(),kpi_code_binary]=2
    #To be uncommented when the data is ready
    fdrs_data_fdrs_na_columns.loc[fdrs_data_fdrs_na_columns[kpi_code_binary]==2,kpi_code_binary]=0

            
In [ ]:
kpi_code_imp
Out[ ]:
'KPI_Climate_CPD_IP'
In [ ]:
print(time_series_imputed[kpi_code_imp].dtypes)
KPI_Climate_CPD_IP          float64
KPI_ClimateHeat_CPD_IP      float64
KPI_ReachCTP_CPD_IP         float64
KPI_ReachDRER_CPD_IP        float64
KPI_ReachDRR_CPD_IP         float64
KPI_ReachH_CPD_IP           float64
KPI_ReachHI_CPD_IP          float64
KPI_ReachHPM_CPD_IP         float64
KPI_ReachL_CPD_IP           float64
KPI_ReachLTSPD_CPD_IP       float64
KPI_ReachM_CPD_IP           float64
KPI_ReachRCRCEd_CPD_IP      float64
KPI_ReachS_CPD_IP           float64
KPI_ReachSI_CPD_IP          float64
KPI_ReachWASH_CPD_IP        float64
KPI_TrainFA_Tot_IP          float64
KPI_DonBlood_Tot_IP         float64
KPI_GB_Tot_IP               float64
KPI_PeopleVol_Tot_IP        float64
KPI_PStaff_Tot_IP           float64
KPI_noLocalUnits_IP         float64
KPI_IncomeLC_CHF_IP         float64
KPI_expenditureLC_CHF_IP    float64
dtype: object
In [ ]:
print(fdrs_data_fdrs_na_columns[kpi_code].dtypes)
KPI_Climate_CPD          float64
KPI_ClimateHeat_CPD      float64
KPI_ReachCTP_CPD         float64
KPI_ReachDRER_CPD        float64
KPI_ReachDRR_CPD         float64
KPI_ReachH_CPD           float64
KPI_ReachHI_CPD          float64
KPI_ReachHPM_CPD         float64
KPI_ReachL_CPD           float64
KPI_ReachLTSPD_CPD       float64
KPI_ReachM_CPD           float64
KPI_ReachRCRCEd_CPD      float64
KPI_ReachS_CPD           float64
KPI_ReachSI_CPD          float64
KPI_ReachWASH_CPD        float64
KPI_TrainFA_Tot          float64
KPI_DonBlood_Tot         float64
KPI_GB_Tot               float64
KPI_PeopleVol_Tot        float64
KPI_PStaff_Tot           float64
KPI_noLocalUnits         float64
KPI_IncomeLC_CHF         float64
KPI_expenditureLC_CHF    float64
dtype: object
In [ ]:
time_series_imputed.to_excel("C:/Users/simon.weiss/OneDrive - IFRC/Tech talk and files/1_PowerBI/1_FDRS/0.Data_Sources/1_FDRS_Wide_Format_IP_ALL_Staging.xlsx",index=False)

4. Post method - interact with Imp_Var? datapoint¶


2 methods:

  • Run first method if 3/ has not been generated.
  • Run second method if 3/ has been generated.

4.0 General configuration¶

In [ ]:
payload={}
files={}
headers = {}
#read private fdrs api key from config files
f = open(module_path+"/src/config/api_key.txt", "r")
api_key=f.readline()

[User Input] Select Years to imputed below

In [ ]:
years=[2021]

[User Input] Variables to impute

In [ ]:
#kpi_code=["KPI_PeopleVol_Tot","KPI_PStaff_Tot","KPI_DonBlood_Tot","KPI_TrainFA_Tot","KPI_noLocalUnits","KPI_ReachDRER_CPD","KPI_ReachLTSPD_CPD","KPI_ReachDRR_CPD","KPI_ReachS_CPD","KPI_ReachL_CPD","KPI_ReachH_CPD","KPI_ReachWASH_CPD","KPI_ReachM_CPD","KPI_ReachCTP_CPD","KPI_ReachSI_CPD","KPI_IncomeLC_CHF","KPI_expenditureLC_CHF"]
kpi_code=["KPI_PeopleVol_Tot"]

4.1 Post Method #1¶

In [ ]:
for kpi in list(kpi_code):
    for year in years:
        for kpi_don_code in list(set(fdrs_data_fdrs_na_columns["KPI_DON_code"])): 
            for label, content in (fdrs_data_fdrs_na_columns[kpi][(fdrs_data_fdrs_na_columns["KPI_Year"]==year)&(fdrs_data_fdrs_na_columns["KPI_DON_code"]==kpi_don_code)]).items():
                value = content
                if type(value)==float:
                    if not np.isnan(value):
                        value=np.int64(value)
                kpi_post=kpi+"_IP"
                url = (f"https://data-api-staging.ifrc.org/api/ImputedKPI?apiKey={api_key}&kpicode={kpi_post}&year={year}&don_code={kpi_don_code}&value={value}&user=simon.weiss@ifrc.org")
                r = requests.post(url)
                print(url,r.text)

4.2 Post methods #2 (only if 3.used)¶

In [ ]:
#fdrs_data_fdrs_na_columns_test.fillna(0,inplace=True)
for kpi in list(kpi_code):
    kpi_binary = kpi + "_binary"
    for year in years:
        for kpi_don_code in list(set(fdrs_data_fdrs_na_columns["KPI_DON_code"])): 
            for bi in (fdrs_data_fdrs_na_columns[kpi_binary][(fdrs_data_fdrs_na_columns["KPI_Year"]==year)&(fdrs_data_fdrs_na_columns["KPI_DON_code"]==kpi_don_code)]):
                binary = bi
                if binary == 1.0: 
                    for label, content in (fdrs_data_fdrs_na_columns[kpi][(fdrs_data_fdrs_na_columns["KPI_Year"]==year)&(fdrs_data_fdrs_na_columns["KPI_DON_code"]==kpi_don_code)]).items():
                        value = content
                        if type(value)==float:
                            if not np.isnan(value):
                                value=np.int64(value)
                                value = value
                        #if value == "nan":
                        if np.isnan(value):
                            value = ""
                        kpi_post=kpi+"_IP"
                        url = (f"https://data-api-staging.ifrc.org/api/ImputedKPI?apiKey={api_key}&kpicode={kpi_post}&year={year}&don_code={kpi_don_code}&value={value}&user=simon.weiss@ifrc.org")
                        r = requests.post(url,headers=headers, data=payload, files=files)
                        #print(url,r.text)
                        #print(r.request.body)
                        #print(r.request.headers)

                        

4.3 Final update lattest update time excel¶

In [ ]:
df1 = pd.DataFrame([datetime.now()], columns=['Lattest Update Date'])
df1.to_excel(module_path+"data/processed/lattest_update.xlsx", index=False)

Time to execute all : before optimization : 111min
After optimization : 2min