Akshay Jain‍

← Back to all posts

Mapping external data to Django model attributes

Mar ‘18

Student Data Form

This post shows how to map custom pandas data frame or CSV/Excel columns to Django model attributes.

This is a very specific problem that I faced as part of a group project for a student analytics platform. I couldn’t find any direct resources to handle this and so this is my attempt to help out anyone else with this same problem. It might be easier to go through the entire code at the bottom of the page to get a better sense of the problem.

I needed to map columns from a CSV or Excel file to attributes in our Django model. The specific problem was to allow a school admin to easily enter student data via an Excel/CSV file upload. However, different schools may have different names for such attributes and there is a need for mapping.

For the complete Django view check the bottom of the page.

Firstly let’s obtain a list of the actual model attributes in our view:

path_name = request.GET.get('df')
df = pd.read_csv(path_name)
names = list(df.columns)
fields = [field.name for field in [MODEL_NAME]._meta.get_fields()]
return render(request, 'fieldmatching.html', {'fields' : fields, 'path_name': path_name, 'names' : names})
view raw 1.py hosted with ❤ by GitHub

NOTE: For simplicity convert your Excel files to CSV like so:

df = pd.read_excel('file')
df.to_csv('path_name', index=False)
view raw 2.py hosted with ❤ by GitHub

Next in our fieldmatching.html we can list out our actual field names like so:

{% for field in fields %}
<option value={{field}}>{{field}}</option>
{% endfor %}
view raw 3.py hosted with ❤ by GitHub

We need to allow the user to select one of these options that matches the of of the “names” to one of the actual “fields”.

{% for name in names %}
{{name}}:
<select name="{{name}}">
{% for field in fields %}
<option value={{field}}>{{field}}</option>
{% endfor %}
</select>
{% endfor %}
<input type="hidden" name="path_name" value={{path_name}}>
view raw 4.py hosted with ❤ by GitHub

This should display the ‘name’ of each column from the CSV file and a set of the model attributes as options. The idea is for the user to select the attribute which most closely matches the user’s uploaded file’s column.

The hidden input allows the file’s path to be transferred to the fieldmatching view when the POST request is made.

Going back to our view, we need to change our CSV file to have the same column names as our actual field attribute names. We can do this with a post request.

if request.method == 'POST':
path_name = request.POST['csv_path']
df = pd.read_csv(path_name)
names = list(df.columns)
view raw 5.py hosted with ❤ by GitHub

Setup the path to the CSV file, create a pandas data frame and load the column list as “names” as above.

matched = { key:request.POST.get(key, False) for key in names }
df.rename(columns = matched, inplace = True)
view raw 6.py hosted with ❤ by GitHub

Now we create a dictionary called “matched”. request.POST.get() gets the attribute name corresponding to each column name. Using pandas rename() function, we rename the old columns (‘names’) with the new columns (‘fields’).

Now time to fill in the model data by converting the data frame to a dictionary and iterating:

df.set_index("index_column", drop=True, inplace=True)
dictionary = df.to_dict(orient="index")
for index, object in dictionary.items():
model = MODEL_NAME()
for key,value in object.items():
setattr(model, key, value)
setattr(m, 'index', index)
model.save()
view raw 7.py hosted with ❤ by GitHub

This should allow you to create objects with custom CSV/Excel files.

Your view should look like:

def fieldmatching(request):
if request.method == 'POST':
path_name = request.POST['path_name']
df = pd.read_csv(path_name)
names = list(df.columns)
if request.POST.get('checkBox') == None:
###To keep the same columns in case of matching 'fields' and ###'names', add a checkbox on the html page
matched = { key:request.POST.get(key, False) for key in names }
df.rename(columns = matched, inplace = True)
df.set_index("index", drop=True, inplace=True)
dictionary = df.to_dict(orient="index")
for index, object in dictionary.items():
model = MODEL_NAME()
for key,value in object.items():
setattr(model, key, value)
setattr(m, 'index', index)
model.save()
return redirect('redirect_to_view')
else:
path_name = request.GET.get('df')
df = pd.read_csv(path_name)
names = list(df.columns)
fields = [field.name for field in MODEL_NAME._meta.get_fields()]
return render(request, 'fieldmatching.html', {'fields' : fields, 'path_name': path_name, 'names' : names})
view raw 8.py hosted with ❤ by GitHub

The fieldmatching.html should look like:

<input id="option" type="checkbox" value="checked" name="checkBox">
{% for name in names %}
{{name}}:
<select name="{{name}}">
{% for field in fields %}
<option value={{field}}>{{field}}</option>
{% endfor %}
</select>
{% endfor %}
<input type="hidden" name="path_name" value={{path_name}}>
view raw 9.py hosted with ❤ by GitHub

urls.py should have the following path added:

path('fieldmatching',views.fieldmatching, name='fieldmatching'),
view raw 10.py hosted with ❤ by GitHub

And that’s it! Feel free to let me know how I can improve my post and let me know if you run into any problem!