<aside> 📌 Task
Review the three data tables and interpret them with a focus on device and UX environment.
</aside>
<aside> 📌 Execution and progress notes
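import pandas as pd
# Load the three source tables
book_reading_status = pd.read_csv('book_reading_status.csv')
user_demographics = pd.read_csv('user_demographics.csv')
user_activity = pd.read_csv('user_activity.csv')
# Merge the tables on user_id (left joins keep every reading-status row)
merged = pd.merge(book_reading_status, user_demographics, on='user_id', how='left')
merged = pd.merge(merged, user_activity, on='user_id', how='left')
# Inspect columns and dtypes
merged.info()
# Check missing values (a bare expression is displayed only as the last line of a notebook cell)
merged.isnull().sum().sort_values(ascending=False)
# Check for outliers in the exit position
merged['exit_position_numeric'].describe()
print(merged.head(10))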
</aside>
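The `Unnamed: 0` column in the merged result sits right where the `user_activity` columns begin, so it is presumably a leftover index column written into that CSV. The sketch below shows one way to guard the merge and drop that column. The three frames are tiny made-up stand-ins for the real files, and the choice of `validate="many_to_one"` assumes `user_id` is unique in the two lookup tables, which the notes do not confirm.

```python
import pandas as pd

# Tiny made-up frames standing in for the three CSV files (illustrative values only).
book_reading_status = pd.DataFrame({
    "user_id": ["user_0001", "user_0002", "user_0003"],
    "exit_position_numeric": [68, 44, 59],
})
user_demographics = pd.DataFrame({
    "user_id": ["user_0001", "user_0002", "user_0003"],
    "device_type": ["mobile", "mobile", "eReader"],
})
user_activity = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2],  # leftover index column, as seen in the info() output
    "user_id": ["user_0001", "user_0002", "user_0003"],
    "entry_channel": ["추천", "추천", "검색"],
})

# validate="many_to_one" raises MergeError if user_id is duplicated on the right side,
# which would otherwise silently multiply rows in a left join.
merged = book_reading_status.merge(
    user_demographics, on="user_id", how="left", validate="many_to_one"
)
merged = merged.merge(user_activity, on="user_id", how="left", validate="many_to_one")

# Drop the leftover index column; errors="ignore" keeps this safe if it is absent.
merged = merged.drop(columns=["Unnamed: 0"], errors="ignore")

# With a unique key on the right, a left join preserves the left table's row count.
assert len(merged) == len(book_reading_status)
print(merged.columns.tolist())
```

If the real lookup tables do contain repeated `user_id` values, the merge raises instead of inflating the row count, which makes the problem visible before any device or UX breakdown is computed.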
<aside> 📌 Results
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 16 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 user_id 1000 non-null object
1 book_id 1000 non-null object
2 genre 1000 non-null object
3 exit_position_numeric 1000 non-null int64
4 dropout_reason_category 1000 non-null object
5 dropout_reason_detail 650 non-null object
6 gender 1000 non-null object
7 birthday 990 non-null object
8 device_type 1000 non-null object
9 subscription_plan 1000 non-null object
10 theme_mode 1000 non-null object
11 Unnamed: 0 1000 non-null int64
12 entry_channel 1000 non-null object
13 quick_preview_used 1000 non-null object
14 recommendation_clicked 1000 non-null bool
15 last_access_timestamp 1000 non-null object
dtypes: bool(1), int64(2), object(13)
memory usage: 118.3+ KB
user_id book_id genre exit_position_numeric dropout_reason_category \
0 user_0001 E01 경제/시사 68 자발적
1 user_0002 E01 경제/시사 44 자발적
2 user_0003 E01 경제/시사 59 자발적
3 user_0004 N05 소설 97 UX 불편
4 user_0005 S02 자기계발 51 UX 불편
5 user_0006 E02 경제/시사 40 자발적
6 user_0007 E01 경제/시사 38 UX 불편
7 user_0008 N05 소설 22 자발적
8 user_0009 W02 웹툰 28 UX 불편
9 user_0010 S03 자기계발 62 UX 불편
dropout_reason_detail gender birthday device_type subscription_plan \
0 지루함 male 1998-06-17 mobile monthly
1 추천 실패 male 2010-04-15 mobile monthly
2 너무 김 male 1985-02-13 mobile monthly
3 NaN female 1974-11-21 eReader monthly
4 NaN male 1970-11-01 mobile free_trial
5 추천 실패 female 1959-12-02 mobile monthly
6 NaN female 1974-06-07 tablet monthly
7 추천 실패 female 1973-11-24 mobile pay_per_book
8 NaN female 1966-05-09 mobile free_trial
9 NaN female 1994-03-23 mobile free_trial
theme_mode Unnamed: 0 entry_channel quick_preview_used \
0 dark 0 추천 No
1 dark 1 추천 No
2 dark 2 추천 No
3 light 3 추천 Yes
4 dark 4 검색 Yes
5 custom 5 홈메인배너 Yes
6 dark 6 추천 Yes
7 dark 7 추천 Yes
8 light 8 홈메인배너 Yes
9 light 9 추천 No
recommendation_clicked last_access_timestamp
0 True 2023-05-08 14:13:00
1 True 2023-01-13 00:54:00
2 True 2023-07-12 09:13:03
3 True 2023-11-20 21:24:55
4 False 2023-11-01 05:55:04
5 True 2023-07-26 18:18:02
6 True 2023-12-19 03:44:32
7 True 2023-08-17 19:08:34
8 False 2023-08-23 18:37:26
9 True 2023-08-29 19:33:08
</aside>
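As a possible next step toward the device and UX focus of the task, the sketch below breaks down dropout reasons and exit positions by `device_type` and `theme_mode`. It uses only the ten rows printed by `head(10)`, retyped by hand into a frame named `sample` (a name chosen here for illustration), so the numbers show the mechanics only and say nothing reliable about the full 1,000 rows.

```python
import pandas as pd

# The ten rows from the head(10) output, limited to the columns used here.
sample = pd.DataFrame({
    "device_type": ["mobile", "mobile", "mobile", "eReader", "mobile",
                    "mobile", "tablet", "mobile", "mobile", "mobile"],
    "theme_mode": ["dark", "dark", "dark", "light", "dark",
                   "custom", "dark", "dark", "light", "light"],
    "dropout_reason_category": ["자발적", "자발적", "자발적", "UX 불편", "UX 불편",
                                "자발적", "UX 불편", "자발적", "UX 불편", "UX 불편"],
    "exit_position_numeric": [68, 44, 59, 97, 51, 40, 38, 22, 28, 62],
})

# Share of each dropout category within each device type (each row sums to 1).
dropout_by_device = pd.crosstab(
    sample["device_type"], sample["dropout_reason_category"], normalize="index"
)

# Median exit position for each device and theme combination.
exit_by_env = sample.groupby(["device_type", "theme_mode"])["exit_position_numeric"].median()

print(dropout_by_device)
print(exit_by_env)
```

On the merged table the same two calls can be applied directly by replacing `sample` with `merged`. Normalizing by row keeps the comparison fair when one device (here, mobile) dominates the sample.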