Sunday, November 8, 2015

Python: Pandas (3) Create data frame


Abstract: Create a data frame using different approaches, then insert or delete columns and rows of a data frame.


Three methods for creating a data frame are presented here.
>>> import numpy as np
>>> import pandas as pd
>>> print "\ncreate a data frame from a 2-D ndarray:"
create a data frame from a 2-D ndarray:
>>> df = pd.DataFrame(np.random.rand(3,4), columns=list("ABCD"), index=list("abc"))
>>> print df
A B C D
a 0.012619 0.410888 0.684326 0.303347
b 0.683163 0.300420 0.438949 0.076538
c 0.323952 0.483425 0.195125 0.764243
[3 rows x 4 columns]
>>>
>>> print '\ncreate a data frame using a dictionary of lists:'
create a data frame using a dictionary of lists:
>>> df=pd.DataFrame({'A':[1,2,3],'B':[4,5,6],'C':[7,8,9]})
>>> df.index=['a','b','c']
>>> print df
A B C
a 1 4 7
b 2 5 8
c 3 6 9
[3 rows x 3 columns]
>>>
>>> print '\ncreate a data frame using a structured array:'
create a data frame using a structured array:
>>> data=np.zeros((3,), dtype=[('name','a15'),('age','i4'),('weight','f4')] )
>>> data[:]=[('a', 34,150),('b',23,170),('c',89,146)]
>>> df=pd.DataFrame(data)
>>> print df
name age weight
0 a 34 150
1 b 23 170
2 c 89 146
[3 rows x 3 columns]
>>>
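
For comparison, here is a minimal sketch of three constructions in Python 3 syntax, assuming a recent pandas and NumPy. The variable names (from_series, from_lists, from_records) are illustrative and do not come from the transcript above. The first case is a real dictionary of Series, which aligns rows on the index labels and fills NaN where a Series has no value.

```python
import numpy as np
import pandas as pd

# Dictionary of Series: rows are aligned on the union of the index labels.
from_series = pd.DataFrame({
    "A": pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"]),
    "B": pd.Series([4.0, 5.0], index=["a", "b"]),  # no value for "c"
})

# Dictionary of equal-length lists, with explicit row labels.
from_lists = pd.DataFrame(
    {"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]},
    index=["a", "b", "c"],
)

# NumPy structured array: field names become column names.
records = np.array(
    [("a", 34, 150.0), ("b", 23, 170.0), ("c", 89, 146.0)],
    dtype=[("name", "U15"), ("age", "i4"), ("weight", "f4")],
)
from_records = pd.DataFrame(records)

print(from_series)
print(from_lists)
print(from_records)
```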

Next, the del statement and the pop() method each delete a column in place; pop() also returns the removed column as a Series:
>>> print df
A B C D
a 0.249442 0.802315 0.600084 0.364948
b 0.337858 0.914759 0.980284 0.179295
c 0.992502 0.287009 0.439516 0.971466

[3 rows x 4 columns]
>>> del df['A']
>>> print 'delete a column using del:', df
delete a column using del:
B C D
a 0.802315 0.600084 0.364948
b 0.914759 0.980284 0.179295
c 0.287009 0.439516 0.971466
[3 rows x 3 columns]
>>> df.pop('C')
a 0.600084
b 0.980284
c 0.439516
Name: C, dtype: float64
>>> print 'delete a column using pop():', df
delete a column using pop():
B D
a 0.802315 0.364948
b 0.914759 0.179295
c 0.287009 0.971466
[3 rows x 2 columns]
>>> df=df.iloc[:,1:]
>>> print 'delete the first column using index:', df
delete the first column using index: D
a 0.364948
b 0.179295
c 0.971466
[3 rows x 1 columns]
>>> print df
B D
a 0.399407 0.862440
b 0.940136 0.470817
c 0.074161 0.638882

[3 rows x 2 columns]
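
The column-removal options above can be sketched side by side in Python 3 syntax, assuming a recent pandas. The sample values and the names popped, without_first and without_b are made up for illustration. The drop(columns=...) call is not used in the transcript; it is included here as a label-based alternative that leaves the original frame untouched.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9], "D": [10, 11, 12]},
    index=["a", "b", "c"],
)

del df["A"]             # removes column A in place, returns nothing
popped = df.pop("C")    # removes column C in place and returns it as a Series

# Positional slice: a new frame without the first remaining column.
without_first = df.iloc[:, 1:]

# Label-based removal: a new frame, df itself is unchanged.
without_b = df.drop(columns=["B"])

print(list(df.columns))
print(list(without_first.columns))
print(list(without_b.columns))
```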

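The drop() method deletes a row. It returns a new data frame and leaves the original unchanged, so assign the result if you want to keep it.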
>>> df.drop('a')
B D
b 0.940136 0.470817
c 0.074161 0.638882

[2 rows x 2 columns]
>>> print 'df is unchanged after drop():', df
df is unchanged after drop():
B D
a 0.399407 0.862440
b 0.940136 0.470817
c 0.074161 0.638882

[3 rows x 2 columns]
>>> df=df[1:]
>>> print 'delete the first row:', df
delete the first row: B D
b 0.940136 0.470817
c 0.074161 0.638882

[2 rows x 2 columns]
>>>
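
A minimal sketch of the row-deletion behaviour in Python 3 syntax, assuming a recent pandas; the sample values and the names dropped, rows_after_drop and tail are illustrative. It shows that drop() does not modify the frame it is called on, and that a positional slice is another way to discard leading rows.

```python
import pandas as pd

df = pd.DataFrame({"B": [4, 5, 6], "D": [10, 11, 12]}, index=["a", "b", "c"])

dropped = df.drop("a")      # new frame without row "a"
rows_after_drop = len(df)   # df still has all three rows

df = df.drop("a")           # reassign to keep the result
tail = df.iloc[1:]          # positional: skip the first remaining row

print(dropped)
print(rows_after_drop)
print(tail)
```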

Finally, the insert() method inserts a column into a data frame at a given position. Its arguments are the column position, the column name, and the values.
>>> df = pd.DataFrame(np.random.rand(3,4), columns=list("ABCD"), index=list("abc"))
>>> df.insert(1, 'E', [245,7,2])
>>> print 'insert a column using insert():', df
insert a column using insert():
A E B C D
a 0.545596 245 0.511448 0.856543 0.829688
b 0.377887 7 0.769014 0.572489 0.301208
c 0.296155 2 0.407141 0.778476 0.819889
[3 rows x 5 columns]
>>>
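
The sketch below, in Python 3 syntax and assuming a recent pandas, illustrates two details of insert() that the transcript does not show: it modifies the frame in place, and it raises ValueError if the column name already exists. The column F and the variable duplicate_error are hypothetical names added for illustration; plain assignment is shown only as the usual way to append a column at the end.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=["a", "b", "c"])

df.insert(1, "E", [245, 7, 2])   # in place: E becomes the second column
df["F"] = df["A"] + df["B"]      # plain assignment appends at the end

duplicate_error = None
try:
    df.insert(0, "E", [0, 0, 0])  # E already exists, so this raises
except ValueError as err:
    duplicate_error = str(err)

print(list(df.columns))
print(duplicate_error)
```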







