pandas mistake review 5: apply, value_counts

# Problem 1
# Calculate for all numerical columns the range between highest and lowest value! Fill in the gaps! The range for mpg is...?

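# I couldn't remember apply.
# Judging from the answer key, my result was wrong anyway. Why did I get this result?
# The values are far too large.
# The slice takes every row, and the columns up to model_year..
# Is it because axis = 1 applies the function row by row?
# Row by row it does make sense: each result is the row's largest value (weight) minus its smallest (cylinders).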
cars.iloc[:, :-2].apply(lambda x: x.max() - x.min(), axis = 1)

cars.iloc[:,:-2].apply(lambda x: x.max() - x.min(), axis = 0)

'''
mpg               37.6
cylinders          5.0
displacement     387.0
horsepower       184.0
weight          3527.0
acceleration      16.8
model_year        12.0
dtype: float64
'''

# With axis = 0 the function is applied to each column, and now the result makes sense.
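
# A minimal sketch of the axis difference on a small made-up frame.
# The name toy and its three rows are assumptions for illustration, not the real cars data.
# With axis=0 the lambda sees one column at a time, with axis=1 one row at a time.

```python
import pandas as pd

# Hypothetical toy data standing in for the cars DataFrame
toy = pd.DataFrame({
    "mpg": [18.0, 15.0, 36.0],
    "cylinders": [8, 8, 4],
    "weight": [3504, 3693, 2130],
})

# axis=0: the lambda receives each column -> one range per column
col_range = toy.apply(lambda x: x.max() - x.min(), axis=0)

# axis=1: the lambda receives each row -> one value per row
# (here always weight minus cylinders, the row's largest and smallest values)
row_range = toy.apply(lambda x: x.max() - x.min(), axis=1)

# The column-wise range can also be written without apply
col_range_vectorized = toy.max() - toy.min()

print(col_range)
print(row_range)
```

# The row-wise result mixes unrelated units, which is why the axis = 1 numbers looked far too large.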

# Problem 2
# Swap the levels of the MultiIndex and then sort the new MultiIndex in ascending order! Reassign cars! Fill in the gaps!

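# swaplevel() has no inplace = True option, so the result has to be reassigned. I forgot that and typed
# cars = cars.swaplevel.sort_index(ascending = True)
# which raised an AttributeError: without the parentheses, cars.swaplevel is a method object and has no sort_index attribute.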
cars = cars.swaplevel().sort_index(ascending = True)
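
# A small sketch of the same mistake and its fix.
# The index levels (model_year, origin) and the values are hypothetical; the real cars index may differ.

```python
import pandas as pd

# Hypothetical two-level index, not the real cars data
idx = pd.MultiIndex.from_tuples(
    [(70, "usa"), (70, "japan"), (71, "europe")],
    names=["model_year", "origin"],
)
toy = pd.DataFrame({"mpg": [18.0, 24.0, 28.0]}, index=idx)

# Without parentheses, toy.swaplevel is a bound method, not a DataFrame,
# so asking it for sort_index raises AttributeError
try:
    toy.swaplevel.sort_index(ascending=True)
    error_name = None
except AttributeError as exc:
    error_name = type(exc).__name__

# Calling the method returns a new DataFrame, which can then be sorted
swapped = toy.swaplevel().sort_index(ascending=True)

print(error_name)
print(swapped)
```

# toy itself is unchanged after this; only the reassigned result carries the swapped, sorted index.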

# Problem 3
# Let's doublecheck the result from above. Check how many car names in the column "names" contain "ford"!

# I know this prints True or False for each row,
# but I don't know how to count the True values.
split[0].str.contains('ford')
'''
0      False
1      False
2      False
3      False
4      False
       ...  
393    False
394     True
395    False
396     True
397    False
Name: 0, Length: 398, dtype: bool
'''

# The answer turned out to be surprisingly simple.

split[0].value_counts()

'''
ford             51
chevrolet        47
plymouth         31
dodge            28
amc              28
toyota           26
datsun           23
vw               22
buick            17
pontiac          16
honda            13
mercury          12
mazda            12
oldsmobile       10
peugeot           8
fiat              8
audi              7
chrysler          6
volvo             6
renault           5
opel              4
subaru            4
saab              4
mercedes-benz     3
cadillac          2
bmw               2
ih                1
nissan            1
triumph           1
Name: 0, dtype: int64
'''

# I could list every count like this and then look up the ford value, but

split[0].str.contains('ford').value_counts()

'''
False    347
True      51
Name: 0, dtype: int64
'''

# Seen this way, there are simply 51 True values for 'ford',
# which means ford appears 51 times.
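
# Another way to count the True values, sketched on a made-up Series.
# makers is a hypothetical stand-in for split[0]; summing a boolean Series counts True as 1 and False as 0.

```python
import pandas as pd

# Hypothetical stand-in for split[0] (the manufacturer part of the car name)
makers = pd.Series(["ford", "chevrolet", "ford", "toyota", "ford"])

# Boolean mask: True where the string contains "ford"
mask = makers.str.contains("ford")

# Summing the mask counts the True values directly
n_contains = mask.sum()

# str.contains matches substrings; an exact comparison is stricter
n_exact = (makers == "ford").sum()

print(n_contains, n_exact)
```

# In this toy data both counts agree, just as value_counts() and str.contains() both gave 51 above.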
