Python: 4 ways to create Pandas DataFrame

The pandas DataFrame is one of the most commonly used objects in data analysis. This article introduces four common ways to build one: from a NumPy array, a dict, a list, and a CSV file.

Composition of DataFrame

The three main components are as follows.

  • values
  • columns
  • index

When no labels are given, both columns and index default to the type RangeIndex.
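
As a minimal sketch of these three components, the snippet below builds a small DataFrame and inspects each attribute. The name `df_parts` is only illustrative, and the frame is built with no labels so the defaults apply.

```python
import numpy as np
import pandas as pd

# Illustrative frame; df_parts is a hypothetical name used only here.
df_parts = pd.DataFrame(np.arange(6).reshape(2, 3))

values = df_parts.values    # the underlying data as a 2-D NumPy array
columns = df_parts.columns  # the column labels
index = df_parts.index      # the row labels

print(values.shape)
print(type(columns).__name__)
print(type(index).__name__)
```

With no labels passed, both `columns` and `index` should report `RangeIndex`.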

DataFrame creation

Using NumPy to build DataFrame

  1. Without setting columns or index.
import pandas as pd
import numpy as np

df_np1 = pd.DataFrame(np.arange(12).reshape(3, 4))
print(df_np1)
print(type(df_np1.index))  # type of the default index, not of the DataFrame itself

#    0  1   2   3
# 0  0  1   2   3
# 1  4  5   6   7
# 2  8  9  10  11
# <class 'pandas.core.indexes.range.RangeIndex'>
  2. With columns and index set explicitly.
# The array has 4 columns, so 4 column labels are required.
df_np2 = pd.DataFrame(np.arange(12).reshape(3, 4),
    columns=['col_0', 'col_1', 'col_2', 'col_3'],
    index=['index_0', 'index_1', 'index_2'])
print(df_np2)
print(type(df_np2.index))  # type of the explicitly labeled index

#          col_0  col_1  col_2  col_3
# index_0      0      1      2      3
# index_1      4      5      6      7
# index_2      8      9     10     11
# <class 'pandas.core.indexes.base.Index'>

When columns and index are set explicitly with string labels, as above, their type is Index rather than RangeIndex.

Use dict to build DataFrame

df_dict = pd.DataFrame({
    'col_0': [0, 1, 2],
    'col_1': [3, 4, 5],
    'col_2': [6, 7, 8]
})
print(df_dict)

#    col_0  col_1  col_2
# 0      0      3      6
# 1      1      4      7
# 2      2      5      8
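
In the dict form, each key becomes a column label and each list becomes that column's values. As a hedged sketch, the same constructor also accepts an `index` argument for row labels; the name `df_dict_labeled` and the `row_*` labels are made up for illustration.

```python
import pandas as pd

# Keys become column labels; each list holds one column's values.
df_dict_labeled = pd.DataFrame(
    {'col_0': [0, 1, 2], 'col_1': [3, 4, 5]},
    index=['row_0', 'row_1', 'row_2'],  # hypothetical row labels
)
print(df_dict_labeled)
```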

Create DataFrame with list

df_list = pd.DataFrame([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
print(df_list)

#    0  1  2
# 0  0  1  2
# 1  3  4  5
# 2  6  7  8
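
Unlike the dict form, each inner list here becomes a row rather than a column. The sketch below, with the illustrative names `rows` and `df_list_named`, passes `columns` so the result has labeled columns instead of the default 0, 1, 2.

```python
import pandas as pd

# Each inner list is one row of the resulting DataFrame.
rows = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
df_list_named = pd.DataFrame(rows, columns=['col_0', 'col_1', 'col_2'])
print(df_list_named)
```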

Create DataFrame using a CSV file

df_csv = pd.read_csv('xxx.csv', index_col=0)
print(df_csv)

#     name   age
# id            
# 1    'A'    30
# 2    'B'    31
# 3    'C'    32
# 4    'D'    33
# 5    'E'    34
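
Because 'xxx.csv' is only a placeholder, the following sketch reads the same kind of data from an in-memory string so it can be run without a file. The CSV content and the names `csv_text` and `df_csv_demo` are assumptions for illustration; `index_col=0` makes the first column (`id`) the index, as in the example above.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a real file on disk.
csv_text = "id,name,age\n1,A,30\n2,B,31\n3,C,32\n"

# read_csv accepts a file-like object as well as a file path.
df_csv_demo = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(df_csv_demo)
```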

Summary

  • A DataFrame can be created from a NumPy array, a dict, a list, or a CSV file.