Python

I. Applying to testData.csv
In [1]:
import warnings
warnings.filterwarnings('ignore')
pandas & matplotlib Packages
In [2]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
Read testData.csv
In [3]:
url = 'https://raw.githubusercontent.com/rusita-ai/pyData/master/testData.csv'
DATA = pd.read_csv(url)
testData.csv Information
In [4]:
DATA.info()
RangeIndex: 5000 entries, 0 to 499..

Regression Analysis
In [1]:
import warnings
warnings.filterwarnings('ignore')
I. Simple Regression Analysis
1) Load Data
In [2]:
import pandas as pd
url = 'https://raw.githubusercontent.com/rusita-ai/pyData/master/Galton.txt'
DF = pd.read_table(url)
DF.info()
RangeIndex: 898 entries, 0 to 897
Data columns (total 6 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Family  898 non-null    objec..

Correlation Analysis
In [ ]:
import warnings
warnings.filterwarnings('ignore')
Load Data
Height and weight data
In [2]:
import pandas as pd
url = 'https://raw.githubusercontent.com/rusita-ai/pyData/master/PII.csv'
DF = pd.read_csv(url)
DF.info()
RangeIndex: 17 entries, 0 to 16
Data columns (total 8 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Name    17 non-null     object
 1   Gender  1..
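
Hypothesis Test
In [ ]:
import warnings
warnings.filterwarnings('ignore')
I. t-Test
In general, the population mean and population variance are unknown.
When the sample mean is standardized with the sample standard deviation instead of the population standard deviation, the resulting statistic follows a t-distribution rather than the standard normal distribution.
Because the test relies on this t-distribution, it is called a t-Test.
1) One-Sample t-Test
(1) Hypothesis
Null hypothesis: the mean weight of the snack is 50.
Alternative hypothesis: the mean weight of the snack is not 50.
(2) Single Sample
In [2]:
import pandas as pd
url = 'https://raw.githubusercontent.com/rusita-ai/pyData/master/foodWeight.csv'
DF..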
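The one-sample t statistic described in the excerpt above can be sketched with the standard library alone. This is a minimal illustration, not the post's own code: the helper name `one_sample_t` and the sample weights are hypothetical and are not taken from foodWeight.csv.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Sample standard deviation (n - 1 in the denominator)
    s = statistics.stdev(sample)
    # Standardize the sample mean with the estimated standard error
    t = (mean - mu0) / (s / math.sqrt(n))
    return t, n - 1


# Hypothetical snack weights, invented for illustration only
weights = [49.2, 50.1, 48.7, 51.3, 49.9, 50.4, 48.5, 49.6]
t_stat, dof = one_sample_t(weights, mu0=50)
print(f"t = {t_stat:.3f}, df = {dof}")
```

To reach a decision, the absolute value of the statistic is compared with the critical value of the t-distribution with `dof` degrees of freedom. If SciPy is available, `scipy.stats.ttest_1samp(weights, 50)` should return the same statistic together with a two-sided p-value.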

Descriptive Statistics
In [ ]:
import warnings
warnings.filterwarnings('ignore')
Load 'tips.csv' Data
In [ ]:
import seaborn as sns
DF = sns.load_dataset('tips')
I. pandas
1) DataFrame Information
In [ ]:
DF.info()
RangeIndex: 244 entries, 0 to 243
Data columns (total 7 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   total_bill  244 non-null    float64
 1   tip         244 non-nul..

seaborn
https://seaborn.pydata.org/api.html
In [ ]:
import warnings
warnings.filterwarnings('ignore')
Practice File Setup
PII.csv
pandas Package
In [2]:
import pandas as pd
.read_csv( )
In [3]:
url = 'https://raw.githubusercontent.com/rusita-ai/pyData/master/PII.csv'
DF = pd.read_csv(url)
DF.head()
Out[3]:
   Name  Gender  Age  Grade  Picture  BloodType  Height  Weight
0  송태섭  남자    21   3      무       B          179.1   63.9
1  최유정  여자    23   1      유       A..

Data Preprocessing
I. Missing Value
II. Filtering
III. Combining DataFrames
IV. Group Operations
V. pivot_table( )
VI. Multi-Index
VII. etc
I. Missing Value
1) 'titanic' Dataset for Practice
In [1]:
import seaborn as sns
TD = sns.load_dataset('titanic')
'titanic' Dataset Information
Datasets loaded through Seaborn are pandas DataFrames, so pandas methods such as .info() work on them.
In [2]:
TD.info()
RangeIndex: 891 entries, 0 to 890
Data columns (total 15 columns):
 #   Column  Non-Null Count  Dtype
---..
simds
Posts tagged 'Python'