The core engine for project $\large\hat{y}$.
The dataset is the New York City Airbnb Open Data set from Kaggle.
from pathlib import Path
import pandas as pd
DATA = Path("../data")
CSV_PATH = DATA / "AB_NYC_2019.csv"
The Airbnb New York City 2019 dataset:
df = pd.read_csv(CSV_PATH)
df.sample(10)
| | id | name | host_id | host_name | neighbourhood_group | neighbourhood | latitude | longitude | room_type | price | minimum_nights | number_of_reviews | last_review | reviews_per_month | calculated_host_listings_count | availability_365 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 17324 | 13676601 | Here's a great offer on a spacious furnished r... | 79598330 | Nesha | Brooklyn | East Flatbush | 40.66337 | -73.92576 | Private room | 30 | 3 | 20 | 2018-03-31 | 0.54 | 1 | 27 |
| 20494 | 16323239 | Spacious 2 BDR - Hell's Kitchen/Times Square | 2559886 | Sy | Manhattan | Hell's Kitchen | 40.76284 | -73.98849 | Entire home/apt | 130 | 4 | 2 | 2017-11-30 | 0.07 | 1 | 0 |
| 27546 | 21679639 | Entire 1-Bedroom Greenpoint Apartment | 4696622 | James | Brooklyn | Greenpoint | 40.72747 | -73.95462 | Entire home/apt | 95 | 3 | 0 | NaN | NaN | 1 | 0 |
| 45469 | 34776151 | Bedroom + den + bath w/ sep. entry in Bed Stuy! | 73612539 | Rebecca | Brooklyn | Bedford-Stuyvesant | 40.68602 | -73.94844 | Private room | 68 | 1 | 8 | 2019-07-02 | 6.32 | 2 | 36 |
| 33070 | 26085075 | Charming studio with PRIVATE DECK by McCarren ... | 27530449 | Estefania | Brooklyn | Greenpoint | 40.72173 | -73.94820 | Private room | 106 | 4 | 21 | 2019-06-03 | 1.73 | 2 | 58 |
| 19553 | 15634892 | Adorable, NYC studio for the holiday! | 15353668 | Bria | Manhattan | Midtown | 40.75228 | -73.97186 | Entire home/apt | 144 | 28 | 0 | NaN | NaN | 1 | 90 |
| 39825 | 30954420 | Artistic apartment in the Heart of Manhattan | 231298987 | Austin | Manhattan | Lower East Side | 40.71890 | -73.98599 | Entire home/apt | 200 | 4 | 1 | 2019-03-16 | 0.26 | 1 | 0 |
| 10897 | 8407092 | Historic Ridgewood Brick Townhouse | 9684993 | Randy | Queens | Ridgewood | 40.70928 | -73.89795 | Entire home/apt | 139 | 5 | 3 | 2018-07-30 | 0.16 | 1 | 0 |
| 10715 | 8247721 | Charming Crown Heights Brownstone | 43494916 | Tilly | Brooklyn | Crown Heights | 40.67791 | -73.95337 | Entire home/apt | 80 | 3 | 0 | NaN | NaN | 2 | 0 |
| 5187 | 3743048 | *WARM*Beautiful*Room*ST. GEORGE steps to ferry! | 19143974 | Meghan | Staten Island | St. George | 40.64408 | -74.07834 | Private room | 58 | 3 | 93 | 2019-05-06 | 2.10 | 1 | 279 |
Configure how we learn the columns
This is a Python console interface that will
- guide the user through the columns one by one,
- let the user decide how each column should be treated during learning.
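The walk-through described above can be sketched as a small loop that asks one or two questions per column and stores the answers. This is only an illustration: the function name, the prompts, and the injectable `ask` callable are assumptions, and only a subset of the configuration keys shown later on this page is recorded.

```python
from typing import Callable, Dict, Iterable


def tour(columns: Iterable[str], ask: Callable[[str], str] = input) -> Dict[str, dict]:
    """Hypothetical sketch: visit each column and record how to treat it."""
    config = {}
    for name in columns:
        use = ask(f"Use column '{name}'? [y/n] ").strip().lower() == "y"
        is_conti = False
        if use:
            # Only ask about the column type when the column is actually used.
            is_conti = ask(f"Is '{name}' continuous? [y/n] ").strip().lower() == "y"
        config[name] = {"name": name, "defined": True, "is_conti": is_conti, "use": use}
    return config


# Scripted answers stand in for keyboard input so the sketch runs unattended.
answers = iter(["n", "y", "n", "y", "y"])
config = tour(["id", "room_type", "latitude"], ask=lambda prompt: next(answers))
```

Passing `ask` as a parameter keeps the default interactive behaviour (`input`) while making the loop easy to script or test.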
md5hash
md5hash(x)
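Only the signature of `md5hash` is shown here; further down the page it is used to key the per-column configuration by column name. A minimal stand-in, assuming it returns the hexadecimal MD5 digest of the string form of its argument, could look like this:

```python
import hashlib


def md5hash(x) -> str:
    """Hypothetical stand-in: hex MD5 digest of the string form of x."""
    return hashlib.md5(str(x).encode("utf-8")).hexdigest()


key = md5hash("room_type")  # a stable 32-character key for this column name
```

A hash of this kind gives every column a fixed-length, filesystem-safe key regardless of spaces or punctuation in the column name.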
X input modules
class ModelInput
ModelInput(rich_col) ::Module
Base class for all neural network modules.
Your models should also subclass this class.
Modules can also contain other Modules, allowing to nest them in a tree structure. You can assign the submodules as regular attributes::
    import torch.nn as nn
    import torch.nn.functional as F

    class Model(nn.Module):
        def __init__(self):
            super(Model, self).__init__()
            self.conv1 = nn.Conv2d(1, 20, 5)
            self.conv2 = nn.Conv2d(20, 20, 5)

        def forward(self, x):
            x = F.relu(self.conv1(x))
            return F.relu(self.conv2(x))

Submodules assigned in this way will be registered, and will have their
parameters converted too when you call :meth:`to`, etc.
class InputEmb
InputEmb(rich_col) ::ModelInput
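The page does not show the body of `InputEmb`. As a rough sketch of what an embedding input for one categorical column might do, assuming integer category indices with 0 reserved for rare or unseen values and the `hidden_size=20` default seen on `RichColumn`, consider the following. The class name and constructor arguments are hypothetical.

```python
import torch
import torch.nn as nn


class EmbeddingInput(nn.Module):
    """Hypothetical sketch: map integer category indices to dense vectors."""

    def __init__(self, num_categories: int, hidden_size: int = 20):
        super().__init__()
        # One extra row so that index 0 can stand for rare or unseen values.
        self.emb = nn.Embedding(num_categories + 1, hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch,) int64 indices -> (batch, hidden_size) float vectors
        return self.emb(x)


layer = EmbeddingInput(num_categories=3)
out = layer(torch.tensor([1, 1, 2, 0]))  # shape (4, 20)
```

Rows with the same index receive the same vector, so the network learns one representation per category.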
class InputOneHot
InputOneHot(rich_col) ::ModelInput
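For comparison with the embedding variant, a one-hot input can be sketched with `torch.nn.functional.one_hot`. The implementation of `InputOneHot` is not shown on this page, so the class below is an assumption-laden illustration, again reserving index 0 for rare or unseen values.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class OneHotInput(nn.Module):
    """Hypothetical sketch: turn integer category indices into one-hot rows."""

    def __init__(self, num_categories: int):
        super().__init__()
        self.num_classes = num_categories + 1  # index 0 kept for rare/unseen

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch,) int64 indices -> (batch, num_classes) floats
        return F.one_hot(x, num_classes=self.num_classes).float()


layer = OneHotInput(num_categories=3)
out = layer(torch.tensor([1, 2, 0]))
```

Unlike an embedding, this representation has no trainable parameters and its width grows with the number of categories.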
class InputConti
InputConti(rich_col) ::ModelInput
Y target encode
class YEncoder
YEncoder(rich_col)
Encode the target y into the input shape required by the call, as a NumPy array.
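For a continuous target such as `price`, one plausible reading of this encoding is a float column vector with one row per sample. The helper below is a sketch under that assumption; its name and the choice of `float32` are not taken from the source.

```python
import numpy as np


def encode_y(values) -> np.ndarray:
    """Hypothetical sketch: continuous target as a float32 array of shape (n, 1)."""
    return np.asarray(values, dtype=np.float32).reshape(-1, 1)


y = encode_y([143, 150, 50])  # three example prices
```

The `(n, 1)` shape lines up with a model whose final layer has a single output feature.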
Enhanced columns
class RichColumn
RichColumn(column,is_y=False,min_occur=5,is_emb=True,hidden_size=20)
A pandas series manager
rdf = RichDF(df,fname = "testing_case_nyc")
Use tour() to set the configuration
rdf.tour()
First, set the target column:
rdf.set_y("price")
This is the configuration I ended up with for each column:
for col in rdf.df:
    print(">"*5, col, "<"*5)
    print(rdf.t[md5hash(col)])
>>>>> id <<<<<
{'name': 'id', 'defined': True, 'is_conti': True, 'is_y': False, 'is_emb': True, 'use': False}
>>>>> name <<<<<
{'name': 'name', 'defined': True, 'is_conti': False, 'is_y': False, 'is_emb': True, 'use': True}
>>>>> host_id <<<<<
{'name': 'host_id', 'defined': True, 'is_conti': False, 'is_y': False, 'is_emb': True, 'use': True}
>>>>> host_name <<<<<
{'name': 'host_name', 'defined': True, 'is_conti': True, 'is_y': False, 'is_emb': True, 'use': False}
>>>>> neighbourhood_group <<<<<
{'name': 'neighbourhood_group', 'defined': True, 'is_conti': False, 'is_y': False, 'is_emb': True, 'use': True}
>>>>> neighbourhood <<<<<
{'name': 'neighbourhood', 'defined': True, 'is_conti': False, 'is_y': False, 'is_emb': True, 'use': True}
>>>>> latitude <<<<<
{'name': 'latitude', 'defined': True, 'is_conti': True, 'is_y': False, 'is_emb': True, 'use': True}
>>>>> longitude <<<<<
{'name': 'longitude', 'defined': True, 'is_conti': True, 'is_y': False, 'is_emb': True, 'use': True}
>>>>> room_type <<<<<
{'name': 'room_type', 'defined': True, 'is_conti': False, 'is_y': False, 'is_emb': True, 'use': True}
>>>>> price <<<<<
{'name': 'price', 'defined': True, 'is_conti': True, 'is_y': True, 'is_emb': False, 'use': True}
>>>>> minimum_nights <<<<<
{'name': 'minimum_nights', 'defined': True, 'is_conti': True, 'is_y': False, 'is_emb': True, 'use': True}
>>>>> number_of_reviews <<<<<
{'name': 'number_of_reviews', 'defined': True, 'is_conti': True, 'is_y': False, 'is_emb': True, 'use': True}
>>>>> last_review <<<<<
{'name': 'last_review', 'defined': True, 'is_conti': True, 'is_y': False, 'is_emb': True, 'use': False}
>>>>> reviews_per_month <<<<<
{'name': 'reviews_per_month', 'defined': True, 'is_conti': True, 'is_y': False, 'is_emb': True, 'use': True}
>>>>> calculated_host_listings_count <<<<<
{'name': 'calculated_host_listings_count', 'defined': True, 'is_conti': True, 'is_y': False, 'is_emb': True, 'use': True}
>>>>> availability_365 <<<<<
{'name': 'availability_365', 'defined': True, 'is_conti': True, 'is_y': False, 'is_emb': True, 'use': True}
list(rdf.Xs)
[<Rich Column:name>, <Rich Column:host_id>, <Rich Column:neighbourhood_group>, <Rich Column:neighbourhood>, <Rich Column:latitude>, <Rich Column:longitude>, <Rich Column:room_type>, <Rich Column:minimum_nights>, <Rich Column:number_of_reviews>, <Rich Column:reviews_per_month>, <Rich Column:calculated_host_listings_count>, <Rich Column:availability_365>]
rdf["room_type"].encode("Entire home/apt")
1
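The call above returns an integer index for a category value. One common scheme that would behave this way ranks values by frequency and reserves index 0 for values seen fewer than `min_occur` times. The sketch below assumes that scheme; the class name and the toy counts are invented for illustration and are not taken from the dataset.

```python
from collections import Counter


class CategoryEncoder:
    """Hypothetical sketch of frequency-ranked encoding with a rare-value bucket."""

    def __init__(self, values, min_occur: int = 5):
        counts = Counter(values)
        # Keep values seen at least min_occur times, most frequent first.
        frequent = [v for v, c in counts.most_common() if c >= min_occur]
        # Index 0 is reserved for anything rare or unseen.
        self.index = {v: i + 1 for i, v in enumerate(frequent)}

    def encode(self, value) -> int:
        return self.index.get(value, 0)


# Toy data only: 8, 6 and 2 occurrences respectively.
room_types = ["Entire home/apt"] * 8 + ["Private room"] * 6 + ["Shared room"] * 2
enc = CategoryEncoder(room_types, min_occur=5)
```

Under this scheme a near-unique column such as a listing name would map almost entirely to index 0.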
class TabularModel
TabularModel(rdf) ::Module
class TabularNN
TabularNN(rich_df,batch_size=128)
rdf["room_type"].col.rc
<Rich Column:room_type>
tnn = TabularNN(rdf)
coldf = tnn.x[1].top_freq["index"]
next(tnn.batch_df())
| | id | name | host_id | host_name | neighbourhood_group | neighbourhood | latitude | longitude | room_type | price | minimum_nights | number_of_reviews | last_review | reviews_per_month | calculated_host_listings_count | availability_365 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 36444734 | Sunny Upper West Side Apt. 3mins from Central ... | 32987938 | Alex | Manhattan | Upper West Side | 40.77889 | -73.97668 | Entire home/apt | 143 | 3 | 0 | NaN | NaN | 1 | 10 |
| 1 | 16987293 | New, Luxury and Sunny Apartment | 57455831 | Maggie | Brooklyn | Clinton Hill | 40.69440 | -73.96606 | Entire home/apt | 150 | 1 | 1 | 2017-05-06 | 0.04 | 2 | 0 |
| 2 | 21154544 | Huge beautiful bedroom with double exposure | 66260832 | Dragana | Manhattan | Harlem | 40.81520 | -73.95175 | Private room | 50 | 15 | 0 | NaN | NaN | 1 | 0 |
| 3 | 2135489 | Charming Studio in Brooklyn | 8624212 | Leon | Brooklyn | Carroll Gardens | 40.68362 | -73.99714 | Entire home/apt | 170 | 2 | 131 | 2019-06-16 | 2.02 | 1 | 26 |
| 4 | 11321187 | Entire 1Br Apt on UES | 59215698 | Daniela | Manhattan | Upper East Side | 40.76796 | -73.95205 | Entire home/apt | 130 | 1 | 21 | 2019-07-07 | 0.56 | 2 | 53 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 123 | 15765350 | Doorman 2 Bed GYM DECK 5212 | 16098958 | Jeremy & Laura | Manhattan | Murray Hill | 40.74437 | -73.97295 | Entire home/apt | 190 | 30 | 4 | 2019-01-14 | 0.13 | 96 | 331 |
| 124 | 23307047 | 2 Story PRIVATE Duplex/Elevator Building in NoMad | 172756149 | C | Manhattan | Kips Bay | 40.74121 | -73.98139 | Entire home/apt | 240 | 1 | 66 | 2019-07-02 | 4.09 | 1 | 83 |
| 125 | 6169068 | Prime Park Slope Townhouse, 4 BR and Garden | 9773128 | Deborah | Brooklyn | Park Slope | 40.66798 | -73.97610 | Entire home/apt | 345 | 30 | 0 | NaN | NaN | 1 | 156 |
| 126 | 24122599 | Cozy room in a Victorian house in Central Broo... | 14905006 | Myriam | Brooklyn | Kensington | 40.63966 | -73.97160 | Private room | 52 | 1 | 11 | 2019-05-27 | 0.81 | 1 | 0 |
| 127 | 18616234 | UPPER EAST SIDE PRIVATE ROOM! | 129317761 | Lauren | Manhattan | East Harlem | 40.78886 | -73.94324 | Private room | 70 | 1 | 11 | 2017-07-15 | 0.42 | 1 | 0 |
128 rows × 16 columns
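The batch above has 128 rows, matching the `batch_size=128` default of `TabularNN`. A generic batching generator of this kind can be sketched as follows. It is not the library's `batch_df` implementation, and it does not shuffle, which the real method may or may not do.

```python
from typing import Iterator, List, Sequence


def batches(rows: Sequence, batch_size: int = 128) -> Iterator[List]:
    """Hypothetical sketch: yield consecutive slices of at most batch_size rows."""
    for start in range(0, len(rows), batch_size):
        yield list(rows[start:start + batch_size])


rows = list(range(300))                      # stand-in for 300 table rows
first = next(batches(rows))                  # first batch only, as with next(...)
sizes = [len(b) for b in batches(rows)]      # the last batch is smaller
```

Because it is a generator, `next(...)` pulls a single batch without materialising the rest.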
x_data,y_data = next(tnn.batch_array())
/Users/salvor/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:110: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
x_data
{'name': tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0]),
'host_id': tensor([ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 24,
0, 0, 0, 0, 0, 153, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 1, 0, 0, 185, 0, 0, 133, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 92, 0, 0, 20,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 106, 0, 0,
0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 75, 0, 0, 0, 237, 8, 0,
224, 0, 345, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 6, 0, 0, 0, 0, 47, 464, 0, 0, 6, 0, 0,
0, 0]),
'neighbourhood_group': tensor([1, 2, 1, 2, 1, 1, 1, 2, 2, 2, 2, 1, 3, 1, 2, 1, 3, 2, 3, 5, 1, 2, 2, 1,
2, 2, 2, 1, 3, 1, 3, 1, 1, 2, 2, 3, 2, 3, 2, 1, 1, 2, 2, 2, 3, 1, 1, 1,
1, 3, 1, 1, 2, 1, 2, 2, 2, 2, 1, 2, 2, 3, 1, 1, 1, 1, 2, 2, 1, 3, 2, 1,
2, 2, 1, 3, 1, 1, 2, 1, 1, 2, 1, 1, 1, 3, 1, 1, 2, 1, 2, 1, 1, 2, 3, 3,
1, 1, 1, 2, 1, 2, 1, 1, 2, 1, 3, 3, 3, 4, 1, 1, 2, 2, 1, 2, 2, 1, 1, 2,
1, 2, 2, 1, 1, 2, 2, 1]),
'neighbourhood': tensor([ 5, 20, 3, 46, 8, 8, 11, 54, 4, 2, 4, 17, 65, 5,
40, 8, 121, 2, 21, 126, 13, 73, 4, 3, 1, 4, 2, 8,
15, 11, 15, 18, 62, 4, 2, 49, 61, 28, 1, 7, 10, 9,
4, 4, 76, 11, 17, 34, 5, 93, 42, 26, 53, 16, 1, 2,
4, 1, 7, 9, 9, 105, 27, 42, 5, 32, 1, 2, 17, 28,
4, 26, 64, 25, 13, 15, 13, 14, 1, 7, 10, 9, 5, 14,
14, 188, 3, 6, 2, 6, 4, 10, 7, 61, 50, 105, 26, 6,
34, 55, 11, 109, 18, 8, 1, 5, 33, 50, 33, 85, 3, 81,
182, 2, 5, 40, 12, 3, 8, 43, 3, 1, 2, 26, 27, 23,
53, 11]),
'latitude': tensor([[40.7789],
[40.6944],
[40.8152],
[40.6836],
[40.7680],
[40.7812],
[40.7971],
[40.6017],
[40.6894],
[40.6890],
[40.7038],
[40.7342],
[40.7312],
[40.7983],
[40.6618],
[40.7770],
[40.7171],
[40.6832],
[40.7623],
[40.5976],
[40.7401],
[40.6804],
[40.7025],
[40.8062],
[40.7152],
[40.7018],
[40.6958],
[40.7754],
[40.7735],
[40.8021],
[40.7659],
[40.7083],
[40.7189],
[40.6975],
[40.6918],
[40.7520],
[40.6809],
[40.7589],
[40.7197],
[40.7258],
[40.7642],
[40.6773],
[40.7077],
[40.7032],
[40.5942],
[40.7860],
[40.7368],
[40.7206],
[40.7892],
[40.5841],
[40.8611],
[40.7512],
[40.6439],
[40.8567],
[40.7074],
[40.6923],
[40.7038],
[40.7179],
[40.7335],
[40.6754],
[40.6703],
[40.6708],
[40.7393],
[40.8671],
[40.7710],
[40.7140],
[40.7210],
[40.6811],
[40.7356],
[40.7447],
[40.6882],
[40.7474],
[40.6200],
[40.6911],
[40.7463],
[40.7587],
[40.7390],
[40.7175],
[40.7121],
[40.7335],
[40.7520],
[40.6736],
[40.8011],
[40.7224],
[40.7230],
[40.7797],
[40.8243],
[40.7612],
[40.6819],
[40.7643],
[40.7026],
[40.7524],
[40.7310],
[40.6814],
[40.7583],
[40.6717],
[40.7474],
[40.7567],
[40.7235],
[40.6596],
[40.7907],
[40.7045],
[40.7072],
[40.7640],
[40.7151],
[40.7922],
[40.7368],
[40.7607],
[40.7476],
[40.8537],
[40.8273],
[40.7136],
[40.5753],
[40.6928],
[40.7938],
[40.6635],
[40.7261],
[40.8128],
[40.7804],
[40.6773],
[40.8077],
[40.7168],
[40.6838],
[40.7444],
[40.7412],
[40.6680],
[40.6397],
[40.7889]]),
'longitude': tensor([[-73.9767],
[-73.9661],
[-73.9518],
[-73.9971],
[-73.9520],
[-73.9499],
[-73.9347],
[-73.9630],
[-73.9088],
[-73.9364],
[-73.9192],
[-74.0062],
[-73.8666],
[-73.9612],
[-73.9822],
[-73.9444],
[-73.8241],
[-73.9170],
[-73.9297],
[-74.0835],
[-74.0007],
[-74.0173],
[-73.9206],
[-73.9553],
[-73.9430],
[-73.9283],
[-73.9408],
[-73.9503],
[-73.9265],
[-73.9436],
[-73.9083],
[-74.0064],
[-73.9976],
[-73.9350],
[-73.9286],
[-73.8689],
[-73.8889],
[-73.8210],
[-73.9602],
[-73.9874],
[-73.9770],
[-73.9454],
[-73.9231],
[-73.9146],
[-73.8004],
[-73.9509],
[-74.0024],
[-73.9981],
[-73.9676],
[-73.8167],
[-73.9296],
[-73.9774],
[-73.9813],
[-73.9302],
[-73.9501],
[-73.9568],
[-73.9271],
[-73.9577],
[-73.9890],
[-73.9566],
[-73.9526],
[-73.7916],
[-73.9776],
[-73.9281],
[-73.9894],
[-73.9977],
[-73.9601],
[-73.9129],
[-74.0060],
[-73.8250],
[-73.9163],
[-73.9784],
[-73.9549],
[-73.9736],
[-73.9914],
[-73.9174],
[-73.9975],
[-73.9915],
[-73.9584],
[-73.9878],
[-73.9725],
[-73.9138],
[-73.9652],
[-73.9877],
[-73.9925],
[-73.7786],
[-73.9451],
[-73.9917],
[-73.9581],
[-73.9891],
[-73.9170],
[-73.9715],
[-73.9848],
[-73.8942],
[-73.8801],
[-73.7967],
[-73.9794],
[-73.9977],
[-74.0041],
[-73.9838],
[-73.9397],
[-73.9858],
[-74.0104],
[-73.9637],
[-73.9523],
[-73.9741],
[-73.9244],
[-73.8794],
[-73.9122],
[-73.9017],
[-73.9453],
[-74.0176],
[-74.0052],
[-73.9442],
[-73.9659],
[-73.9844],
[-73.9481],
[-73.9516],
[-73.9466],
[-73.9837],
[-73.9550],
[-73.9562],
[-73.9215],
[-73.9730],
[-73.9814],
[-73.9761],
[-73.9716],
[-73.9432]]),
'room_type': tensor([1, 1, 2, 1, 1, 1, 2, 2, 2, 1, 2, 1, 2, 2, 1, 1, 2, 3, 1, 2, 1, 1, 2, 1,
2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 2,
1, 1, 1, 1, 2, 3, 2, 2, 2, 1, 2, 1, 2, 1, 1, 2, 1, 1, 1, 2, 1, 1, 1, 2,
1, 1, 2, 2, 1, 2, 2, 2, 1, 1, 2, 2, 1, 3, 2, 2, 1, 2, 1, 2, 2, 1, 2, 3,
1, 2, 2, 1, 2, 1, 1, 1, 1, 1, 2, 2, 1, 2, 1, 1, 1, 2, 1, 1, 1, 2, 1, 2,
2, 2, 1, 1, 1, 1, 2, 2]),
'minimum_nights': tensor([[ 3.],
[ 1.],
[15.],
[ 2.],
[ 1.],
[50.],
[ 6.],
[ 1.],
[ 1.],
[ 4.],
[ 2.],
[ 9.],
[ 2.],
[30.],
[ 4.],
[ 2.],
[ 1.],
[ 1.],
[ 2.],
[ 4.],
[ 3.],
[ 1.],
[ 1.],
[ 2.],
[ 1.],
[ 4.],
[ 2.],
[ 4.],
[ 1.],
[ 1.],
[ 1.],
[29.],
[ 2.],
[ 2.],
[ 2.],
[ 2.],
[ 1.],
[ 1.],
[ 2.],
[ 5.],
[ 6.],
[ 3.],
[ 7.],
[14.],
[ 2.],
[ 6.],
[ 2.],
[ 3.],
[ 6.],
[ 2.],
[26.],
[30.],
[ 1.],
[ 2.],
[ 3.],
[ 2.],
[14.],
[ 7.],
[ 2.],
[ 5.],
[ 1.],
[ 1.],
[ 5.],
[ 2.],
[ 2.],
[ 2.],
[ 1.],
[ 5.],
[ 1.],
[ 3.],
[ 2.],
[10.],
[ 5.],
[ 2.],
[ 1.],
[ 2.],
[30.],
[ 2.],
[ 3.],
[ 1.],
[ 3.],
[ 2.],
[ 2.],
[ 1.],
[ 2.],
[ 2.],
[ 2.],
[22.],
[ 3.],
[ 1.],
[ 2.],
[ 3.],
[ 1.],
[ 5.],
[ 1.],
[ 4.],
[30.],
[ 2.],
[ 1.],
[ 3.],
[ 3.],
[ 2.],
[ 1.],
[ 3.],
[ 2.],
[30.],
[ 7.],
[ 1.],
[ 2.],
[ 1.],
[ 2.],
[ 1.],
[ 4.],
[ 3.],
[30.],
[ 1.],
[ 3.],
[ 3.],
[ 5.],
[ 1.],
[ 1.],
[ 1.],
[ 2.],
[30.],
[ 1.],
[30.],
[ 1.],
[ 1.]]),
'number_of_reviews': tensor([[ 23.2745],
[ 1.0000],
[ 23.2745],
[131.0000],
[ 21.0000],
[ 23.2745],
[ 14.0000],
[ 3.0000],
[ 23.2745],
[ 23.2745],
[ 23.2745],
[ 23.2745],
[ 26.0000],
[ 1.0000],
[ 2.0000],
[ 4.0000],
[ 10.0000],
[ 1.0000],
[ 16.0000],
[ 6.0000],
[ 23.2745],
[ 50.0000],
[ 23.2745],
[ 11.0000],
[ 17.0000],
[ 1.0000],
[ 51.0000],
[147.0000],
[ 2.0000],
[ 1.0000],
[ 30.0000],
[ 23.2745],
[ 44.0000],
[ 5.0000],
[ 9.0000],
[ 14.0000],
[ 91.0000],
[ 39.0000],
[ 24.0000],
[166.0000],
[ 3.0000],
[ 23.2745],
[ 28.0000],
[ 23.2745],
[ 35.0000],
[ 4.0000],
[ 47.0000],
[ 21.0000],
[ 21.0000],
[ 13.0000],
[ 4.0000],
[ 2.0000],
[ 52.0000],
[ 23.2745],
[ 2.0000],
[ 12.0000],
[ 23.2745],
[ 23.2745],
[ 3.0000],
[ 1.0000],
[ 23.2745],
[154.0000],
[ 13.0000],
[120.0000],
[ 5.0000],
[ 3.0000],
[ 12.0000],
[ 23.2745],
[ 23.2745],
[ 95.0000],
[ 92.0000],
[ 23.2745],
[ 15.0000],
[ 73.0000],
[ 23.2745],
[ 2.0000],
[ 23.2745],
[ 23.2745],
[ 4.0000],
[ 39.0000],
[ 17.0000],
[ 49.0000],
[ 14.0000],
[ 1.0000],
[168.0000],
[ 6.0000],
[ 28.0000],
[ 1.0000],
[ 1.0000],
[147.0000],
[ 3.0000],
[ 1.0000],
[114.0000],
[ 2.0000],
[ 12.0000],
[ 4.0000],
[ 23.2745],
[290.0000],
[ 23.2745],
[ 3.0000],
[ 4.0000],
[150.0000],
[ 8.0000],
[ 1.0000],
[ 59.0000],
[ 23.2745],
[ 23.2745],
[296.0000],
[ 15.0000],
[ 1.0000],
[ 9.0000],
[ 23.2745],
[ 23.2745],
[ 40.0000],
[ 1.0000],
[ 7.0000],
[ 3.0000],
[ 3.0000],
[ 1.0000],
[ 28.0000],
[ 95.0000],
[ 38.0000],
[ 23.2745],
[ 4.0000],
[ 66.0000],
[ 23.2745],
[ 11.0000],
[ 11.0000]]),
'reviews_per_month': tensor([[ 1.3732],
[ 0.0400],
[ 1.3732],
[ 2.0200],
[ 0.5600],
[ 1.3732],
[ 0.3500],
[ 1.1500],
[ 1.3732],
[ 1.3732],
[ 1.3732],
[ 1.3732],
[ 1.9600],
[ 0.7900],
[ 0.2800],
[ 0.4900],
[10.0000],
[ 0.0400],
[ 0.4400],
[ 0.4100],
[ 1.3732],
[ 0.9200],
[ 1.3732],
[ 0.5100],
[ 0.5500],
[ 0.0200],
[ 0.8600],
[ 2.7600],
[ 0.0700],
[ 1.0000],
[ 5.8100],
[ 1.3732],
[ 5.6400],
[ 0.7900],
[ 3.4200],
[ 5.0000],
[ 1.8100],
[ 3.4900],
[ 0.5600],
[ 1.6800],
[ 0.2200],
[ 1.3732],
[ 2.7800],
[ 1.3732],
[ 0.9500],
[ 0.0600],
[ 1.1600],
[ 0.6300],
[ 1.8300],
[ 1.5400],
[ 0.3500],
[ 0.3800],
[ 2.4700],
[ 1.3732],
[ 0.0600],
[ 1.3500],
[ 1.3732],
[ 1.3732],
[ 0.0800],
[ 0.0400],
[ 1.3732],
[ 8.2400],
[ 1.2300],
[ 1.2700],
[ 4.2900],
[ 0.2100],
[ 0.5300],
[ 1.3732],
[ 1.3732],
[ 3.5200],
[ 4.3300],
[ 1.3732],
[ 1.0100],
[ 1.3900],
[ 1.3732],
[ 0.0800],
[ 1.3732],
[ 1.3732],
[ 0.4800],
[ 2.0000],
[ 2.4900],
[ 3.3100],
[ 1.0600],
[ 0.0200],
[ 3.9300],
[ 0.9500],
[ 0.8200],
[ 0.0900],
[ 0.1600],
[ 5.4500],
[ 0.0600],
[ 0.0300],
[ 7.9700],
[ 0.0600],
[ 9.7300],
[ 0.4100],
[ 1.3732],
[ 5.9200],
[ 1.3732],
[ 3.0000],
[ 3.0000],
[ 2.5700],
[ 0.3400],
[ 0.0500],
[ 3.2800],
[ 1.3732],
[ 1.3732],
[10.6000],
[ 1.0200],
[ 1.0000],
[ 0.2900],
[ 1.3732],
[ 1.3732],
[ 1.9900],
[ 0.0500],
[ 0.1200],
[ 0.0600],
[ 0.3900],
[ 1.0000],
[ 1.4400],
[ 3.9300],
[ 0.6300],
[ 1.3732],
[ 0.1300],
[ 4.0900],
[ 1.3732],
[ 0.8100],
[ 0.4200]]),
'calculated_host_listings_count': tensor([[ 1.],
[ 2.],
[ 1.],
[ 1.],
[ 2.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 32.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 8.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[327.],
[ 2.],
[ 1.],
[ 8.],
[ 1.],
[ 2.],
[ 9.],
[ 1.],
[ 2.],
[ 1.],
[ 2.],
[ 3.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 2.],
[ 1.],
[ 1.],
[ 1.],
[ 91.],
[ 11.],
[ 1.],
[ 1.],
[ 34.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 2.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 3.],
[ 10.],
[ 1.],
[ 1.],
[ 2.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[232.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 3.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 2.],
[ 1.],
[ 1.],
[ 3.],
[ 2.],
[ 12.],
[ 1.],
[ 1.],
[ 1.],
[ 7.],
[ 87.],
[ 1.],
[ 7.],
[ 2.],
[ 6.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 2.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 3.],
[ 96.],
[ 1.],
[ 1.],
[ 2.],
[ 1.],
[ 17.],
[ 5.],
[ 1.],
[ 1.],
[ 96.],
[ 1.],
[ 1.],
[ 1.],
[ 1.]]),
'availability_365': tensor([[ 10.0000],
[112.7813],
[112.7813],
[ 26.0000],
[ 53.0000],
[112.7813],
[335.0000],
[112.7813],
[175.0000],
[112.7813],
[112.7813],
[112.7813],
[180.0000],
[335.0000],
[179.0000],
[112.7813],
[160.0000],
[179.0000],
[112.7813],
[285.0000],
[252.0000],
[ 19.0000],
[ 5.0000],
[ 18.0000],
[354.0000],
[112.7813],
[179.0000],
[310.0000],
[112.7813],
[ 17.0000],
[ 85.0000],
[342.0000],
[275.0000],
[128.0000],
[268.0000],
[ 32.0000],
[255.0000],
[342.0000],
[112.7813],
[365.0000],
[351.0000],
[112.7813],
[ 28.0000],
[112.7813],
[ 4.0000],
[112.7813],
[361.0000],
[ 82.0000],
[116.0000],
[ 73.0000],
[ 58.0000],
[241.0000],
[ 42.0000],
[363.0000],
[112.7813],
[305.0000],
[112.7813],
[112.7813],
[112.7813],
[112.7813],
[112.7813],
[112.7813],
[ 24.0000],
[132.0000],
[ 94.0000],
[ 5.0000],
[112.7813],
[356.0000],
[112.7813],
[ 63.0000],
[ 4.0000],
[ 19.0000],
[105.0000],
[ 20.0000],
[112.7813],
[ 87.0000],
[320.0000],
[ 23.0000],
[112.7813],
[112.7813],
[ 33.0000],
[210.0000],
[ 8.0000],
[112.7813],
[ 65.0000],
[169.0000],
[ 80.0000],
[187.0000],
[112.7813],
[174.0000],
[359.0000],
[365.0000],
[ 38.0000],
[112.7813],
[112.7813],
[365.0000],
[365.0000],
[ 26.0000],
[331.0000],
[302.0000],
[ 90.0000],
[309.0000],
[ 54.0000],
[ 5.0000],
[ 81.0000],
[358.0000],
[112.7813],
[322.0000],
[ 27.0000],
[363.0000],
[112.7813],
[365.0000],
[ 19.0000],
[ 70.0000],
[250.0000],
[112.7813],
[110.0000],
[112.7813],
[ 23.0000],
[362.0000],
[224.0000],
[157.0000],
[112.7813],
[331.0000],
[ 83.0000],
[156.0000],
[112.7813],
[112.7813]])}
tnn.model.dnn
Sequential(
  (0): Linear(in_features=107, out_features=107, bias=True)
  (1): BatchNorm1d(107, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (2): ReLU()
  (3): Linear(in_features=107, out_features=1, bias=True)
  (4): BatchNorm1d(1, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
ipts = list(tnn.model.inputs[xcol.name](x_data[xcol.name]) for xcol in tnn.model.rdf.Xs)
concat = torch.cat(ipts,dim=1)
concat,concat.shape
(tensor([[-0.3061, 0.5583, -1.7048, ..., -0.1683, -0.2213, -0.8510],
[-0.3061, 0.5583, -1.7048, ..., -0.6799, -0.1963, -0.3392],
[-0.3061, 0.5583, -1.7048, ..., -0.1683, -0.2213, -0.3392],
...,
[-0.3061, 0.5583, -1.7048, ..., -0.1683, -0.2213, 0.0281],
[-0.3061, 0.5583, -1.7048, ..., -0.4205, -0.2213, -0.3392],
[-0.3061, 0.5583, -1.7048, ..., -0.5656, -0.2213, -0.3392]],
grad_fn=<CatBackward>), torch.Size([128, 107]))
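The concatenated width of 107 is consistent with five categorical columns embedded at width 20 plus seven continuous columns of width 1, although the page does not state this breakdown explicitly. The sketch below reproduces the same pattern on a smaller scale: per-column tensors are concatenated along `dim=1` and passed through a stack that mirrors the `Sequential` printed earlier. The random tensors are placeholders for real per-column module outputs.

```python
import torch
import torch.nn as nn

batch = 128
# Hypothetical per-column outputs: two embedded columns and one continuous column.
parts = [torch.randn(batch, 20), torch.randn(batch, 20), torch.randn(batch, 1)]
concat = torch.cat(parts, dim=1)  # (128, 41)

width = concat.shape[1]
# Same layer pattern as the printed model: Linear, BatchNorm, ReLU, Linear, BatchNorm.
dnn = nn.Sequential(
    nn.Linear(width, width),
    nn.BatchNorm1d(width),
    nn.ReLU(),
    nn.Linear(width, 1),
    nn.BatchNorm1d(1),
)
y_pred = dnn(concat)  # (128, 1), one prediction per row
```

Deriving the first layer's width from the concatenated tensor keeps the network in sync with whichever columns are enabled in the configuration.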
next(tnn.batch_y_pred())
tensor([[ 0.1854],
[ 0.4965],
[ 0.0657],
[ 0.5390],
[ 0.0346],
[ 0.1109],
[ 1.0257],
[ 0.9426],
[-1.4236],
[-0.4027],
[-1.3949],
[ 1.0144],
[ 1.6111],
[ 0.4727],
[-0.3963],
[ 0.0993],
[ 1.8734],
[-0.5209],
[ 1.1702],
[-0.6602],
[ 0.9155],
[-0.1697],
[-1.4661],
[ 0.3416],
[ 0.0491],
[-1.4276],
[-0.3420],
[ 0.0853],
[ 0.3223],
[ 1.3579],
[ 0.5943],
[ 0.3173],
[-0.9157],
[-1.4335],
[-1.6335],
[-0.1290],
[-0.4366],
[ 2.6865],
[-0.5450],
[-0.8102],
[ 0.2763],
[-1.5096],
[-1.4165],
[-1.3428],
[-0.5520],
[ 1.3585],
[ 0.8928],
[ 0.5567],
[ 0.2359],
[-0.9065],
[ 0.3392],
[-0.8105],
[-1.2481],
[ 0.7013],
[ 0.2318],
[-0.7692],
[-1.3154],
[-0.5138],
[-1.1481],
[-1.4941],
[-1.3265],
[ 2.1126],
[ 0.6395],
[ 0.6270],
[ 0.3102],
[ 0.2761],
[-0.5575],
[-2.1002],
[ 1.0423],
[ 2.4838],
[-1.5184],
[ 0.7619],
[-0.4804],
[-1.2091],
[ 0.2233],
[ 0.7135],
[ 0.5420],
[-0.5502],
[ 0.2483],
[-1.1345],
[ 0.4551],
[-1.6280],
[ 0.8449],
[-0.4392],
[-0.2708],
[ 0.9606],
[ 0.2369],
[ 0.8148],
[-0.3557],
[ 0.7816],
[-1.4975],
[ 1.1632],
[-1.2288],
[-0.2945],
[ 1.4050],
[ 2.5949],
[ 0.6319],
[ 0.7928],
[-1.1700],
[-0.2946],
[ 2.8332],
[-0.5895],
[ 0.3957],
[ 0.0810],
[-0.5038],
[ 0.0837],
[ 0.9681],
[ 1.1302],
[-0.0971],
[ 0.5286],
[ 0.3181],
[-0.8433],
[-0.8857],
[-1.8803],
[ 0.2565],
[-0.3163],
[-1.0606],
[ 0.2665],
[ 0.0986],
[-0.5930],
[ 0.9827],
[ 0.1503],
[-0.4754],
[ 0.1420],
[ 0.5774],
[ 0.1458],
[-1.0844],
[ 0.9949]], grad_fn=<NativeBatchNormBackward>)