[Pandas] Exploring and Sorting Data

1. Exploring DataFrame information

- read_<filetype>(path), e.g. read_csv(), read_excel() : reads a file; options such as header and usecols let you read only part of it

import pandas as pd

CCTV_Seoul = pd.read_csv("../data/01. Seoul_CCTV.csv", encoding="utf-8")
pop_Seoul = pd.read_excel("../data/01. Seoul_Population.xls")

pop_Seoul = pd.read_excel(
    "../data/01. Seoul_Population.xls", header=2, usecols="B,D,G,J,N"
)
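
The effect of header and usecols can be tried without the Excel file. The following is a minimal sketch, assuming a small made-up CSV held in memory (the column names and numbers are illustrative, not taken from the Seoul data). Note that read_csv takes column names or positions in usecols, whereas read_excel also accepts Excel column letters such as "B,D,G,J,N".

```python
import io

import pandas as pd

# Hypothetical sample: two junk lines sit above the real header row.
raw = """report,,,
generated,,,
district,total,korean,foreign
A,100,90,10
B,200,150,50
"""

# header=2 uses the third line (0-based index 2) as the column names;
# usecols keeps only the listed columns.
df = pd.read_csv(io.StringIO(raw), header=2, usecols=["district", "total"])
print(df)
```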

- rename() : renames columns (or index labels); with inplace=True, the change is applied to the DataFrame itself

pop_Seoul.rename(
    columns={
        pop_Seoul.columns[0]:"구별",    # district
        pop_Seoul.columns[1]:"인구수",  # total population
        pop_Seoul.columns[2]:"한국인",  # Korean residents
        pop_Seoul.columns[3]:"외국인",  # foreign residents
        pop_Seoul.columns[4]:"고령자",  # elderly residents
    },
    inplace=True,
)
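
To see what inplace changes, here is a small sketch on a hypothetical two-column frame (the names col_a, col_b, district, and population are made up for illustration). Without inplace, rename() returns a new DataFrame and the original keeps its old names.

```python
import pandas as pd

# Hypothetical frame with uninformative column names.
df = pd.DataFrame({"col_a": ["A", "B"], "col_b": [100, 200]})

# Default behaviour: a renamed copy is returned, df is left untouched.
renamed = df.rename(columns={"col_a": "district", "col_b": "population"})
before = list(df.columns)

# inplace=True: df itself is modified.
df.rename(columns={df.columns[0]: "district"}, inplace=True)
after = list(df.columns)

print(before, list(renamed.columns), after)
```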

- head() : shows the first 5 rows by default; pass a number in () to show that many rows

CCTV_Seoul.head()

- tail() : shows the last 5 rows by default; also accepts a row count

CCTV_Seoul.tail()
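
A quick sketch of the row-count argument, assuming a hypothetical 10-row frame rather than the CCTV data:

```python
import pandas as pd

# Hypothetical frame with rows numbered 0..9.
df = pd.DataFrame({"n": range(10)})

first_three = df.head(3)   # rows 0, 1, 2
last_two = df.tail(2)      # rows 8, 9
default_head = df.head()   # 5 rows when no argument is given

print(first_three, last_two, sep="\n")
```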

- index : shows the index (row labels)

CCTV_Seoul.index

- columns : shows the column labels

CCTV_Seoul.columns

- values : shows the values as an array

CCTV_Seoul.values
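
The three attributes can be compared side by side on a tiny frame. This is only a sketch with made-up data; the exact printed representation may differ between pandas versions.

```python
import pandas as pd

# Hypothetical two-row frame.
df = pd.DataFrame({"district": ["A", "B"], "total": [100, 200]})

row_labels = list(df.index)      # row labels, 0 and 1 by default
col_labels = list(df.columns)    # column names
cells = df.values                # 2-D NumPy array holding the cell values

print(row_labels, col_labels, cells.shape)
```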

- info() : shows basic information about the DataFrame (column dtypes, non-null counts, memory usage)

CCTV_Seoul.info()

- describe() : shows descriptive statistics for the DataFrame

CCTV_Seoul.describe()
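
As a rough illustration with made-up numbers: info() prints its summary directly, while describe() returns a DataFrame of statistics that can be indexed like any other. On a frame that mixes text and numbers, describe() summarises only the numeric columns by default.

```python
import pandas as pd

# Hypothetical frame: one text column, one numeric column.
df = pd.DataFrame({"district": ["A", "B", "C"], "total": [100, 200, 300]})

df.info()               # prints dtypes, non-null counts, memory usage

stats = df.describe()   # count, mean, std, min, quartiles, max
mean_total = stats.loc["mean", "total"]
print(mean_total)
```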

2. Sorting data

- sort_values() : sorts the rows by the values of one or more columns; accepts ascending and inplace

ex) pop_Seoul.sort_values(["인구수"], ascending=False).head()

ex) df.sort_values(by="B", ascending=False, inplace=True)

CCTV_Seoul.sort_values(by="소계", ascending=True).head(5)
CCTV_Seoul.sort_values(by="소계", ascending=False).head(5)
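
The sort directions can be checked on a small frame. This sketch uses hypothetical data; it also shows that, without inplace=True, the original row order is left as it was.

```python
import pandas as pd

# Hypothetical frame, deliberately out of order.
df = pd.DataFrame({"district": ["A", "B", "C"], "total": [200, 100, 300]})

asc = df.sort_values(by="total")                     # ascending=True is the default
desc = df.sort_values(by="total", ascending=False)   # largest first

print(asc["district"].tolist())
print(desc["district"].tolist())
print(df["district"].tolist())   # original order is unchanged
```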

- ["column"] : selects specific columns

CCTV_Seoul["구별"]                      # select a single column
CCTV_Seoul[["구별", "소계"]].head()       # select two columns
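
One detail worth knowing here: a single label returns a Series, while a list of labels returns a DataFrame, even if the list has only one entry. A minimal sketch with made-up data:

```python
import pandas as pd

# Hypothetical frame with three columns.
df = pd.DataFrame(
    {"district": ["A", "B"], "total": [100, 200], "extra": [1, 2]}
)

one = df["district"]              # single label -> Series
two = df[["district", "total"]]   # list of labels -> DataFrame
still_frame = df[["district"]]    # one-item list -> one-column DataFrame

print(type(one).__name__, type(two).__name__, type(still_frame).__name__)
```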
