I'm a deep learning beginner. I'd like to run a linear regression on the data below, but I don't know how to build the feature matrix, so I'm posting here.
date,Unemployment Rate,id,title,state
1976-01-01,7.6,LAUST023A,Unemployment Rate in Alaska,Alaska
1977-01-01,9.8,LAUST023A,Unemployment Rate in Alaska,Alaska
1978-01-01,10.7,LAUST023A,Unemployment Rate in Alaska,Alaska
1976-01-01,6.7,LAUST013A,Unemployment Rate in Alabama,Alabama
1977-01-01,7.1,LAUST013A,Unemployment Rate in Alabama ,Alabama
1978-01-01,6.4,LAUST013A,Unemployment Rate in Alabama,Alabama
1976-01-01,6.9,LAUST053A,Unemployment Rate in Arkansas,Arkansas
1977-01-01,6.5,LAUST053A,Unemployment Rate in Arkansas ,Arkansas
1978-01-01,6.2,LAUST053A,Unemployment Rate in Arkansas ,Arkansas
=====================================================
date,Percent of People in Poverty,id,title,state
1976-01-01,10.6,PPAAAK02,Percent of People in Poverty for Alaska,Alaska
1977-01-01,11.3,PPAAAK02,Percent of People in Poverty for in Alaska,Alaska
1978-01-01,7.9,PPAAAK02,Percent of People in Poverty for in Alaska,Alaska
1976-01-01,8.3,PPAAAL01,Percent of People in Poverty for Alabama,Alabama
1977-01-01,11.4,PPAAAL01,Percent of People in Poverty for Alabama,Alabama
1978-01-01,7.9,PPAAAL01,Percent of People in Poverty for Alabama,Alabama
1976-01-01,6.9,PPAAAR05,Percent of People in Poverty for Arkansas,Arkansas
1977-01-01,6.5,PPAAAR05,Percent of People in Poverty for Arkansas,Arkansas
1978-01-01,6.2,PPAAAR05,Percent of People in Poverty for Arkansas,Arkansas
=====================================================
date,Real Gross Domestic Product,id,title,state
1976-01-01,42211.3 ,ARRGSP,Real Gross Domestic Product in Alaska,Alaska
1977-01-01,41095.9 ,ARRGSP,Real Gross Domestic Product in Alaska,Alaska
1978-01-01,42355.3 ,ARRGSP,Real Gross Domestic Product in Alaska,Alaska
1976-01-01,144501.2,AKRGSP,Real Gross Domestic Product in Alabama,Alabama
1977-01-01,176625,AKRGSP,Real Gross Domestic Product in Alabama,Alabama
1978-01-01,53336.5,AKRGSP,Real Gross Domestic Product in Alabama,Alabama
1976-01-01,107932.5,ALRGSP,Real Gross Domestic Product in Arkansas,Arkansas
1977-01-01,89871.7 ,ALRGSP,Real Gross Domestic Product in Arkansas,Arkansas
1978-01-01,200800.9,ALRGSP,Real Gross Domestic Product in Arkansas,Arkansas
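To describe the data briefly: the first column is the date, and the second column holds the value of the attribute being measured, which here is one of [Unemployment Rate, Percent of People in Poverty, Real Gross Domestic Product]. The id column is a unique identifier for each series; it is the id used to request that attribute's values for a given state. The last column is the state the data belongs to.

As I said, I want to try a linear regression on the Unemployment Rate values, but I don't understand well enough how to construct the features. I'm planning to build a DataFrame with Python pandas; would it make sense to use date and state as the index? Any advice on what shape the feature columns should take would be appreciated.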
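One possible layout, as a minimal sketch rather than a definitive answer: treat each (state, date) pair as one observation, put the three attributes side by side as columns, and then split off Unemployment Rate as the target. The `load` helper and the variable names are made up for illustration, and the choice of poverty rate and real GDP as the only features is an assumption. The rows are copied from the samples above; stray spaces around values are stripped before converting to numbers.

```python
import io
import numpy as np
import pandas as pd

UNEMPLOYMENT_CSV = """date,Unemployment Rate,id,title,state
1976-01-01,7.6,LAUST023A,Unemployment Rate in Alaska,Alaska
1977-01-01,9.8,LAUST023A,Unemployment Rate in Alaska,Alaska
1978-01-01,10.7,LAUST023A,Unemployment Rate in Alaska,Alaska
1976-01-01,6.7,LAUST013A,Unemployment Rate in Alabama,Alabama
1977-01-01,7.1,LAUST013A,Unemployment Rate in Alabama ,Alabama
1978-01-01,6.4,LAUST013A,Unemployment Rate in Alabama,Alabama
"""

POVERTY_CSV = """date,Percent of People in Poverty,id,title,state
1976-01-01,10.6,PPAAAK02,Percent of People in Poverty for Alaska,Alaska
1977-01-01,11.3,PPAAAK02,Percent of People in Poverty for in Alaska,Alaska
1978-01-01,7.9,PPAAAK02,Percent of People in Poverty for in Alaska,Alaska
1976-01-01,8.3,PPAAAL01,Percent of People in Poverty for Alabama,Alabama
1977-01-01,11.4,PPAAAL01,Percent of People in Poverty for Alabama,Alabama
1978-01-01,7.9,PPAAAL01,Percent of People in Poverty for Alabama,Alabama
"""

GDP_CSV = """date,Real Gross Domestic Product,id,title,state
1976-01-01,42211.3 ,ARRGSP,Real Gross Domestic Product in Alaska,Alaska
1977-01-01,41095.9 ,ARRGSP,Real Gross Domestic Product in Alaska,Alaska
1978-01-01,42355.3 ,ARRGSP,Real Gross Domestic Product in Alaska,Alaska
1976-01-01,144501.2,AKRGSP,Real Gross Domestic Product in Alabama,Alabama
1977-01-01,176625,AKRGSP,Real Gross Domestic Product in Alabama,Alabama
1978-01-01,53336.5,AKRGSP,Real Gross Domestic Product in Alabama,Alabama
"""


def load(csv_text, value_col):
    """Hypothetical helper: return one attribute as a Series indexed by (state, date)."""
    df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])
    # Strip stray whitespace before converting, since some values look like "42211.3 ".
    df[value_col] = pd.to_numeric(df[value_col].astype(str).str.strip())
    df["state"] = df["state"].str.strip()
    return df.set_index(["state", "date"])[value_col]


# One row per (state, date); one column per attribute.
data = pd.concat(
    [
        load(UNEMPLOYMENT_CSV, "Unemployment Rate"),
        load(POVERTY_CSV, "Percent of People in Poverty"),
        load(GDP_CSV, "Real Gross Domestic Product"),
    ],
    axis=1,
).sort_index()

# Target vector and feature matrix (the feature choice is an assumption).
feature_cols = ["Percent of People in Poverty", "Real Gross Domestic Product"]
y = data["Unemployment Rate"].to_numpy()
X = data[feature_cols].to_numpy()

# Ordinary least squares with an intercept column prepended.
X_design = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(X_design, y, rcond=None)
print(data)
print("intercept and coefficients:", coef)
```

With this layout the (state, date) index only identifies rows; it is not itself a feature. If the state should influence the prediction, one option would be to reset the index and one-hot encode the state column (for example with `pd.get_dummies`) before building `X`. Real GDP is on a much larger scale than the two percentage columns, so rescaling the features may be worth considering if a gradient-based model is used instead of least squares.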