|
|
|||||||||
|
|||||||||
| |||||||||
|
|
|
| |||||||||
![]() |
|
|
«
Previous Thread
|
Next Thread
»
|
Thread Tools | Search this Thread | Rate Thread | Display Modes |
|
#1
|
||||
|
||||
|
Expert help needed...
I have a potentially very large table in an application I am building. I was wondering if I could get some input on what people perceive as the best method.
I want to be able to store up to 24 months of room availability for as many as 1M (possibly even more) hotel rooms. Basically I need 730 days per room. The room data is stored as a 3 characters/digits that will represent availability (3 possible), type(5 possible), special(4 possible). I was planning on storing the data as either an integer "324" or a short string "ab2" but what I can't decide is how best to design my table. I am clearly not going to make 740 fields per room (I need 3 fields to store the room id, the facility id and 8 fileds to store the room price/features) So the problem is, do I store the room data in one table and make 730 records in a 3 field table: availability ============ roomid, date, info or do I do something like this: make the room table have 24 month fields: store the room status for the whole month as one string I can explode: 234/234/234/134/134/132/132/ etc. in a char(124) field this would limit my table to 1M records with 34 fields each. the other way I could have 730M+ records of 3 fields each. there is a definate advantage to the first when retrieving information about rooms by date I could pull back all the rooms in a particular set of facilities that had a status > 300 which would indicate available 1= reserved, 2=confirmed(reserved and paid for) 3=available, but I am still leaning towards the second. To retrieve the same data I would have to pull back the full month data for each room in each facility in the set and either explode it or grab the substr() that represented the day in question which is clearly going to be slower but the data seems more compact this way. It seems very likely it will also be faster when looking at a particular facility to retreive the data for that facility from a table with 1M records than a table with 730M records so performance wise it might be a wash. I guess I could test it at some point but I would need a large set of test data which I could probably make with a couple of for loops. Hope this makes sense. |
|
#2
|
||||
|
||||
|
RE: How would you design this?
I'd recommend combining data as little as possible. The more you combine data into fields and similar fields into rows, the more work you need to do outside the DB and the less the DB gets to do what it's made for. It's also much more difficult to add effective indices.
I'd suggest a room table that has one row for each room and an availability table that has one row for each room and date. In the availability table, you should really split the "info" field into three fields of one character each. It will make your queries much easier to read and maintain, let alone faster - especially since indexing on availability (if that makes sense for you queries) will be much easier. Additionally, it seems like you could reduce the size of your availability talbe by only having records when a room is not available on some date. So, no record found (or NULL values with an OUTER JOIN) would simply mean the room is avialable on that date. |
|
#3
|
||||
|
||||
|
RE: How would you design this?
Unfortunately I have to carry a room field for every date because the special code has to be set by the manager in advance. There is no really good way to manage this without an entry for each date unless i determine what's the most common status and eliminate those entries when the room is available and has that status. If I go with an individual entry for each day for each room I can easily break the type, availability and special codes into their own fields but I'm still looking at 730M+ records in that table which worries me a little. Even indexed by date I think it's going to be a little slow.
|
|
#4
|
||||
|
||||
|
RE: How would you design this?
Presuming your server has enough memory to effectively use indices, I don't think the performance difference between 1M and 730M rows is as dramatic as you think. Look at MySQL's Estimating Query Performance page.
Admittedly, 730M records would require a lot of memory for any size index, but maybe you have a huge server to run this on. |
|
#5
|
||||
|
||||
|
RE: How would you design this?
hmmm, that looks as if I'm going to have a huge slowdown with 730M records or even at 5-10M not to mention the size of the index is going to grow exponentially. I still think I might need to go with the 1 record per room method. It will cut down on the effectiveness of the search for available rooms but it will keep the much more frequent pulls of rooms by facility in working order.
|
|
#6
|
||||
|
||||
|
Expert help needed
I was looking for more opinions on this one, if possible.
|
|
#7
|
|||
|
|||
|
RE: Expert help needed...
Would there be any advantage to say, partial or functional indexes? i.e.
create table rooms ( id int status1 text, status2 text, status3 text ); create index rooms_stat1_r on rooms where status1='r'; create index rooms_stat1_g on rooms where status1='g'; create index rooms_stat1_b on rooms where status1='b'; create index rooms_stat2_3 on rooms where status1='3'; then a query like: select * from room where status1='r' and status2='3'; would both have fairly small, and 100% correlated indexs at the ready. This is a postgresql feature, not sure if MySQL supports it. |
|
#8
|
|||
|
|||
|
RE: Expert help needed...
And, sometimes a bigger index is better, like so...
create index rooms_dx on rooms (stat1, stat2, stat3); Can be used by most databases in a: select * from rooms where stat1='r' order by stat2; Note that if you don't need the have the third column, leave it out, as it will make the index much bigger than just a two column index. |
![]() |
| Viewing: Codewalkers Forums > Other Technologies > Database Help > Expert help needed... |
| Thread Tools | Search this Thread |
| Display Modes | Rate This Thread |
|
|
|
|