Analyzing the state of the Basic Health Units of Brazil (UBS) using Python for Data Science

This work was developed during the “Python IMD challenge”, held on 10/21/2017, with Igor, Ricardo, Luiza and me. The purpose of the competition was to develop a Data Science project in 5 hours. Our goal was to choose something impactful yet simple enough to build in the short time given. We were very happy to learn at the end that we had won first place! The prize is a free ticket to the national Python event that is going to happen next year.

eqp.jpg

Without further ado, let’s talk about the project itself!

While searching for datasets on various topics, we found the national website that hosts numerous pre-formatted datasets of national interest: http://dados.gov.br/

The subject that caught our attention was the Basic Health Units of Brazil (Unidade Básica de Saúde, UBS), which are essentially small public hospitals. The dataset had some interesting columns that we thought could lead to important conclusions, for example the coordinates of each unit and its evaluation on different aspects such as physical structure and medical supplies.

After acquiring the dataset, we imported the necessary Python modules: the well-known NumPy and pandas, plus Folium, which we used to draw bubble maps on top of a world map:


import numpy as np
import pandas as pd
import folium

We downloaded the UBS data from http://dados.gov.br/dataset/unidades-basicas-de-saude-ubs in the desired format (“.csv”) and loaded it into a variable “ubs”:


ubs = pd.read_csv('ubs.csv')

After a group discussion, we decided to remove some information irrelevant to our analysis, such as the phone number and address of each UBS and the municipality code (in the case of the municipality code, we already had the map coordinates):


del ubs['cod_cnes']
del ubs['cod_munic']
del ubs['nom_estab']
del ubs['dsc_telefone']
del ubs['dsc_endereco']
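
An alternative to deleting the columns one by one is to drop them in a single call. The sketch below uses a tiny inline CSV as a stand-in for ubs.csv: the rows are made up for illustration, and only the column names come from the code above. Passing errors='ignore' is a defensive choice so the call does not fail if a column is missing from a given export.

```python
import io
import pandas as pd

# Made-up two-row sample standing in for ubs.csv; only the column
# names are taken from the post.
sample_csv = io.StringIO(
    "cod_cnes,cod_munic,nom_estab,dsc_telefone,dsc_endereco,dsc_cidade\n"
    "1,10,UBS A,000,Rua 1,Natal\n"
    "2,20,UBS B,111,Rua 2,Recife\n"
)
sample = pd.read_csv(sample_csv)

unused = ['cod_cnes', 'cod_munic', 'nom_estab', 'dsc_telefone', 'dsc_endereco']
# drop() returns a new DataFrame; errors='ignore' skips missing names.
sample = sample.drop(columns=unused, errors='ignore')
print(list(sample.columns))
```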

Since we decided to base our research on the evaluations of the hospitals, we noted that each evaluation takes one of only three classes, which can be read as bad, medium or good. We stored those classes in a list for later comparisons:


clases = ['Desempenho mediano ou  um pouco abaixo da média','Desempenho acima da média','Desempenho muito acima da média']
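
Before hard-coding the labels, it can help to list the distinct values a column actually holds, since a stray space or a different accent makes an equality test fail silently. A minimal sketch on made-up rows (the column name is the one used in this post; the data is illustrative only):

```python
import pandas as pd

# Illustrative data only; the real values come from ubs.csv.
demo = pd.DataFrame({
    'dsc_estrut_fisic_ambiencia': [
        'Desempenho acima da média',
        'Desempenho muito acima da média',
        'Desempenho acima da média',
    ]
})

# value_counts() lists each distinct label with its frequency.
counts = demo['dsc_estrut_fisic_ambiencia'].value_counts()
print(counts)
```

Copying the labels from this output into the list avoids typing them by hand.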

We plotted a bubble map of the evaluation of each UBS: red dots represent bad evaluations, orange dots medium ones and green dots good ones. The following code plots the structure evaluation of the UBSs. We chose to plot a reduced number of points to minimize the computational effort.


# Number of rows to plot. The value used in the competition is not shown
# in this post; 1000 is an assumed cap to keep rendering fast.
brasil_epoch = min(1000, len(ubs))

# Border colour for each evaluation class: bad, medium, good.
colors = dict(zip(clases, ['red', 'orange', 'green']))

# Note: "Mapbox Bright" may not be available in newer folium releases;
# if it fails, omit `tiles` to fall back to the default tile set.
m = folium.Map(location=[20,0], tiles="Mapbox Bright", zoom_start=2)
for i in range(0,brasil_epoch):
    color = colors.get(ubs['dsc_estrut_fisic_ambiencia'][i])
    # Rows whose label is missing or not in `clases` are skipped.
    if color is not None:
        folium.Circle(
            location=[ubs['vlr_latitude'][i],ubs['vlr_longitude'][i]],
            popup="clan",
            radius=3,
            color=color,
            fill=True,
            fill_color='crimson'
        ).add_to(m)
m.save('mapa_estrutura_brasil.html')
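
Taking the first rows of the file is the simplest way to limit the number of points, but if the file happens to be ordered by region the map may end up covering only part of the country. A possible alternative, not used in the original project, is a random sample via DataFrame.sample. The sketch below uses made-up coordinates:

```python
import pandas as pd

# Made-up coordinates standing in for the real UBS rows.
demo = pd.DataFrame({
    'vlr_latitude': [-5.8, -23.5, -15.8, -30.0, -3.1],
    'vlr_longitude': [-35.2, -46.6, -47.9, -51.2, -60.0],
})

# Draw 3 rows at random; random_state makes the draw reproducible.
subset = demo.sample(n=3, random_state=0)

# The original index labels are kept, so rows can be traced back.
for i, row in subset.iterrows():
    print(i, row['vlr_latitude'], row['vlr_longitude'])
```

Because the sampled frame keeps its original index labels, the plotting loop would iterate over the sampled rows rather than over range(0, n).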

plot_brazil

It was also suggested that we plot data for our home city (Natal). The following code generates a visualization of the structure evaluations of the UBSs in Natal.


# .copy() returns an independent DataFrame, so the `copy` module
# (which was never imported) is not needed.
ubs2 = ubs[ubs['dsc_cidade']=='Natal'].copy()

# Border colour for each evaluation class: bad, medium, good.
colors = dict(zip(clases, ['red', 'orange', 'green']))

m = folium.Map(location=[20,0], tiles="Mapbox Bright", zoom_start=2)
for i, row in ubs2.iterrows():
    color = colors.get(row['dsc_estrut_fisic_ambiencia'])
    # Rows whose label is missing or not in `clases` are skipped.
    if color is not None:
        folium.Circle(
            location=[row['vlr_latitude'],row['vlr_longitude']],
            popup="clan",
            radius=3,
            color=color,
            fill=True,
            fill_color='crimson'
        ).add_to(m)
m.save('mapa_estrutura_natal.html')

plot_natal

After various plots, we decided to create a coefficient that combines the different evaluations by taking the mean of their classes. We mapped the classes to a range from 0 to 10: bad is 3.3, medium is 6.7 and good is 10. Since the values are discrete, we also plotted a histogram showing how often each calculated value occurs among the analysed UBSs.


a = 3.3
b = 6.7
c = 10
# Numeric grade for each evaluation label.
grades = {
    'Desempenho mediano ou um pouco abaixo da média': a,
    'Desempenho acima da média': b,
    'Desempenho muito acima da média': c,
}
cols = ['dsc_estrut_fisic_ambiencia', 'dsc_adap_defic_fisic_idosos', 'dsc_equipamentos', 'dsc_medicamentos']

# Work on a copy so the assignments below do not touch `ubs`
# (and do not trigger pandas' chained-assignment warning).
ubs_natal = ubs[ubs['dsc_cidade']=='Natal'].copy()
for col in cols:
    # Collapse repeated spaces first: the first label is written in this
    # post both with one and with two spaces after "ou".
    # Labels not found in `grades` become NaN.
    ubs_natal[col] = ubs_natal[col].str.split().str.join(' ').map(grades)
ubs_natal['dsc_estrut_fisic_ambiencia'].hist()

# Mean of the four grades; NaN if any of them is missing.
ubs_natal['media'] = ubs_natal[cols].mean(axis=1, skipna=False)

m = folium.Map(location=[20,0], tiles="Mapbox Bright", zoom_start=2)
for j in ubs_natal.iterrows():
    i = j[0]
    media = ubs_natal['media'][i]
    # Pick the colour band; a missing grade matches no band and is skipped.
    if media <= 4:
        color = 'red'
    elif media <= 8:
        color = 'orange'
    elif media > 8:
        color = 'green'
    else:
        continue
    folium.Circle(
        location=[ubs['vlr_latitude'][i],ubs['vlr_longitude'][i]],
        popup="clan",
        radius=3,
        color=color,
        fill=True,
        fill_color='crimson'
    ).add_to(m)
m.save('mapa_notas_natal.html')
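
As a worked example of the coefficient, consider a hypothetical UBS rated bad on structure, medium on accessibility, good on equipment and medium on medicines: (3.3 + 6.7 + 10 + 6.7) / 4 = 6.675, which falls in the orange band. The sketch below reproduces this in plain Python; the function names are illustrative and not part of the original project.

```python
GRADES = {
    'Desempenho mediano ou um pouco abaixo da média': 3.3,
    'Desempenho acima da média': 6.7,
    'Desempenho muito acima da média': 10,
}

def coefficient(labels):
    """Average the numeric grades of the given evaluation labels."""
    return sum(GRADES[label] for label in labels) / len(labels)

def colour(grade):
    """Same thresholds as the map: <=4 red, <=8 orange, otherwise green."""
    if grade <= 4:
        return 'red'
    if grade <= 8:
        return 'orange'
    return 'green'

example = [
    'Desempenho mediano ou um pouco abaixo da média',  # structure
    'Desempenho acima da média',                       # accessibility
    'Desempenho muito acima da média',                 # equipment
    'Desempenho acima da média',                       # medicines
]
grade = coefficient(example)
print(round(grade, 3), colour(grade))
```

With these grade values, a unit only lands in the red band when all four parameters are in the lowest class, since three lowest grades plus one medium grade already average 4.15.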

notas_brazil.png

hist_brasil

The above map and histogram portray the situation of the UBSs across Brazil using the calculated coefficient, which is based on 4 different evaluation parameters and ranges from 0 to 10. Judging by the evaluations given in the dataset, it is possible to see how neglected these health units are. A slight tendency towards better-evaluated UBSs in the south-east of the country is also visible. The processed information can be used to show government institutions the situation of the UBSs and help them decide on improvements to the system.
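
One way to put a number on such a regional tendency is to average the coefficient per group. The sketch below groups made-up rows by dsc_cidade, the only location column named in this post; grouping by state or region would need a corresponding column, which is an assumption here rather than something confirmed from the dataset. The media values are invented for illustration.

```python
import pandas as pd

# Invented coefficients for two cities, for illustration only.
demo = pd.DataFrame({
    'dsc_cidade': ['Natal', 'Natal', 'São Paulo', 'São Paulo'],
    'media': [5.0, 6.0, 7.0, 9.0],
})

# Mean coefficient per city, highest first.
by_city = demo.groupby('dsc_cidade')['media'].mean().sort_values(ascending=False)
print(by_city)
```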

notas_natal

hist_natal

The same criteria were used to plot the above graphs for our hometown.

At the end we also calculated the mean value of each of the 4 parameters and its standard deviation:


# Assumes the four evaluation columns of `ubs` already hold numeric grades,
# i.e. the same label-to-grade conversion shown for `ubs_natal` was applied.
cols = ['dsc_estrut_fisic_ambiencia', 'dsc_adap_defic_fisic_idosos', 'dsc_medicamentos', 'dsc_equipamentos']
labels = ['Physical Structure', 'Accessibility', 'Medical supplies', 'Equipment quality']

# Column-wise mean and standard deviation, indexed by column name.
ubs_brasil_mean = ubs[cols].mean()
ubs_brasil_std = ubs[cols].std()
for label, col in zip(labels, cols):
    print(label + ' of UBS: ', ubs_brasil_mean[col], '+/-', ubs_brasil_std[col])

The source files can be found at:

https://github.com/YangTavares/Data_Science/tree/master/yang_desafio_imd
