When hashes aren't enough

by Sprad (Hermit)
on May 25, 2004 at 15:23 UTC

Sprad has asked for the wisdom of the Perl Monks concerning the following question:

I'd like to store some information about some items, where each item has a set of attributes. Something like:
Car
  Wheels=4
  Doors=4
  Color=Blue
Bike
  Wheels=2
  Color=Red
In the actual data, there'll be between 20 and 30 attributes for each item. Some items don't use certain attributes; those would either be left out entirely or just set to a null value.

I could use a hash for each attribute and access it with $Color{"Car"}, or use a hash of hashes, but is there a really nice neat way to do stuff like this? Am I treading so close to OOP that I just need to take the plunge now?

A fair fight is a sign of poor planning.

Replies are listed 'Best First'.
Re: When hashes aren't enough
by dragonchild (Archbishop) on May 25, 2004 at 15:26 UTC
    Use a hash of hashes (aka, HoH). Yeah, you're treading close to OO, but I'd take the plunge when the plunge is warranted. You're building a 2-level lookup table. Nothing more, nothing less. OO is for when your things do stuff. These things don't do much, other than retain knowledge.

    We are the carpenters and bricklayers of the Information Age.

    Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

    I shouldn't have to say this, but any code, unless otherwise stated, is untested

      While it is true that *real* OO is nice when objects do stuff, OOP in Perl, even if halfway done without doing stuff (it can do stuff, but doesn't have to), is a decent way to implement C structures in a halfway clean sort of way without having to deal with the ugly gorp that is an HoH. That being said, my objects contain HoH's all the time, but I really hate it when outside methods must delve into them each time the hard way, or to iterate through their keys.

      I suggest you take the plunge, but retain the knowledge of the old ways. Properly mixed, they are powerful. You never really have to go "pure OO" if you don't want to, especially not in Perl. Trivial OO (i.e. objects that just work like glorified C structs) is OK for starters until you decide to add more real-OO functionality. Of course, OO isn't really OO until you are using inheritance and other fancy over-hyped concepts :)
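
      As a rough illustration of the "glorified struct" style, here is a minimal sketch. The package name Vehicle and its get/set/attributes methods are made up for this example, not taken from any module; the object is just a blessed hash, so the HoH knowledge still applies underneath.

```perl
use strict;
use warnings;

# Hypothetical "glorified struct": a blessed hash with plain accessors.
package Vehicle;

sub new {
    my ($class, %attrs) = @_;
    return bless { %attrs }, $class;
}

# Read an attribute; returns undef if the item does not carry it.
sub get {
    my ($self, $name) = @_;
    return $self->{$name};
}

# Set an attribute and return the object so calls can be chained.
sub set {
    my ($self, $name, $value) = @_;
    $self->{$name} = $value;
    return $self;
}

# List the attribute names this item actually has, sorted.
sub attributes {
    my ($self) = @_;
    my @names = sort keys %$self;
    return @names;
}

package main;

my $car  = Vehicle->new(Wheels => 4, Doors => 4, Color => 'Blue');
my $bike = Vehicle->new(Wheels => 2, Color => 'Red');

$bike->set(Color => 'Green');
print join(', ', $car->attributes), "\n";   # Color, Doors, Wheels
print $bike->get('Color'), "\n";            # Green
```

      Callers never touch the hash directly, so the storage can change later without breaking them.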

      Note that over half of the developers that think they are doing "real OO" are just using overglorified structs -- which means (yes), you can do that in C or pretty much in any language. The next step is to mix functional, procedural, and OO styles all in one, which sounds goofy -- but it's kind of cool at the same time. Anyhow, go forth, and take the plunge. The water's fine. (Just beware the alligators -- such as folks who profess fancy "design patterns" and don't have enough common sense to keep things simple when they can be simple).

Re: When hashes aren't enough
by Stevie-O (Friar) on May 25, 2004 at 15:47 UTC
    Using OOP doesn't actually require packages and bless()ing. It just means you have to do something in an object-oriented way :)

    With that in mind, it depends on what you're using the information *for*. For example, if you only need to process the color information for everything at once, using $color{'car'} and $color{'bike'} makes perfect sense.

    On the other hand, if you needed to keep the different attributes of an item together (for example, for storage to a file on disk), you might prefer to do this:

    $car = {
        wheels => 4,
        doors  => 4,
        color  => 'blue',
        trans  => 'auto'
    };
    $bike = {
        wheels => 2,
        color  => 'red',
    };
    The reason for doing this is that, with all the attributes of an item in one hash, it's easy to pass that item around to whatever might need it. For example, you can pass it to Storable's freeze method and preserve your car in a disk file (for thawing by another script, or a later instance of your own). Or, you could pass a hash to a function that does something generic to whatever is passed in (like telling you if its turn signals automatically shut off; vehicles without steering wheels don't have that). This sort of design is at the heart of OO.
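
    A minimal sketch of the Storable round trip, assuming the $car hash from above. freeze and thaw are Storable's in-memory serializers; writing the frozen string to a file is left out here.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

my $car = { wheels => 4, doors => 4, color => 'blue', trans => 'auto' };

# freeze() serializes the structure into a string held in memory;
# that string could then be written to a file or sent elsewhere.
my $frozen = freeze($car);

# thaw() rebuilds an independent copy from the serialized string.
my $copy = thaw($frozen);

# Changing the copy leaves the original untouched.
$copy->{color} = 'green';
print "$car->{color} $copy->{color}\n";   # blue green
```

    If the goal is a disk file, Storable's store and retrieve functions write and read the file directly.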
    $"=$,,$_=q>|\p4<6 8p<M/_|<('=> .q>.<4-KI<l|2$<6%s!<qn#F<>;$, .=pack'N*',"@{[unpack'C*',$_] }"for split/</;$_=$,,y[A-Z a-z] {}cd;print lc
Re: When hashes aren't enough
by fletcher_the_dog (Friar) on May 25, 2004 at 15:51 UTC
    If you don't want to do OO, then building a hash of hashes could be pretty easy to work with. You could do something like this:
    my %vehicles = (
        Car => {
            Wheels => 4,
            Doors  => 4,
            Color  => "Blue",
        },
        Bike => {
            Wheels => 2,
            Color  => "Red",
        }
    );

    # work with the attributes
    while (my ($vehicle, $attributes) = each(%vehicles)) {
        print "Let me tell you about a $vehicle\n";
        if (my $color = $attributes->{Color}) {
            print "A $vehicle is $color\n";
        }
        if (my $wheels = $attributes->{Wheels}) {
            print "A $vehicle has $wheels wheels\n";
        }
    }
Re: When hashes aren't enough
by dimar (Curate) on May 25, 2004 at 16:10 UTC

    Short Answer

    You can use references to store any conceivable kind of data structure in Perl.

    ### INIT pragma
    use strict;
    use warnings;

    ### INIT vars
    my $data_root = {};
    $data_root->{car} = {
        wheels => 4,
        doors  => 4,
        color  => 'blue',
    };
    $data_root->{bike} = {
        wheels => 2,
        doors  => undef,
        color  => 'blue',
    };

    ### DISPLAY
    print $data_root->{car}{color};
    print "\n---------------------\n";
    print $data_root->{bike}{color};
    print "\n---------------------\n";
    print $data_root->{$_}{wheels}, " wheels " foreach (keys %{$data_root});
    print "\n---------------------\n";
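
    The bike above stores doors => undef, which is one of the two options the question mentions (null value versus leaving the attribute out). A small sketch of how code can tell the two apart; the sled entry and the door_state helper are hypothetical additions for illustration.

```perl
use strict;
use warnings;

my $data_root = {
    car  => { wheels => 4, doors => 4,     color => 'blue' },
    bike => { wheels => 2, doors => undef, color => 'blue' },
    sled => { color => 'red' },   # hypothetical item with no 'doors' key at all
};

# Distinguish "key missing" from "key present but undef" from a real value.
sub door_state {
    my ($item) = @_;
    return 'absent' unless exists $item->{doors};
    return 'null'   unless defined $item->{doors};
    return $item->{doors};
}

for my $name (sort keys %$data_root) {
    print "$name: ", door_state($data_root->{$name}), "\n";
}
```

    Which convention to pick matters mostly when iterating over keys: a null attribute still shows up in keys %$item, an omitted one does not.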

    Long Answer: what is your goal

    Am I treading so close to OOP that I just need to take the plunge now?
    It depends on what you intend to do. Are you storing this data to keep an inventory database of merchandise? Are you creating a structured entity with properties (eg print $color{car}) and methods (eg $car->drive('50mph') )? Are you rolling your own format for an initialization file? Are you ever going to need more than one 'car' or 'bike'? Are you ever going to need nested data (eg a car with a bike in the trunk)?
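
    For the nested-data case, a rough sketch: the trunk attribute is an invented name, and it simply holds an array of references to other items.

```perl
use strict;
use warnings;

my $bike = { wheels => 2, color => 'red' };
my $car  = {
    wheels => 4,
    doors  => 4,
    color  => 'blue',
    trunk  => [ $bike ],   # 'trunk' is a hypothetical attribute holding other items
};

# Walk the nested items through the references stored in the car.
for my $cargo (@{ $car->{trunk} }) {
    print "In the trunk: a $cargo->{color} item with $cargo->{wheels} wheels\n";
}

# The trunk holds a reference, not a copy: repainting the bike shows up in both places.
$bike->{color} = 'green';
print $car->{trunk}[0]{color}, "\n";   # green
```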

    OOP may not even be relevant if you are simply looking for a way to store some text data.

    See what else is already out there

    You can always search on CPAN; you may find something that already fits the bill perfectly. Also, by examining someone else's approach to your goal, you will discover answers to questions you hadn't even considered yet. All in all, it is a great way to learn new stuff.

Re: When hashes aren't enough
by talexb (Chancellor) on May 25, 2004 at 17:14 UTC

    I can highly recommend YAML for storing this kind of information.

    I use it to store configuration information for my web application, and it's lightweight and very flexible.

    Alex / talexb / Toronto

    Life is short: get busy!

      Here's another vote for YAML. In fact, if this is a strictly Perl application with no interop issues, you're almost certain to be better off with YAML than XML. Hands down.

Re: When hashes aren't enough
by jaa (Friar) on May 25, 2004 at 17:28 UTC

    OO - as others have said... 'it depends'. I find the question a little unclear; you have items with attributes...

    Car
      Wheels=4
      Doors=4
      Color=Blue
    Bike
      Wheels=2
      Color=Red
    Which are easily encoded as hashes...
    my %item = (
        Car => {
            Wheels => 4,
            Doors  => 4,
            Color  => 'Blue',
        },
        Bike => {
            Wheels => 2,
            Color  => 'Red'
        },
    );
    but then you look for $Color{"Car"}... rather than $item{Car}{Color}

    So how exactly do you want to use %item? Do you just want to store the details somewhere? Or do you want attribute indexes...

    my %item = (
        Car => {
            Wheels => 4,
            Doors  => 4,
            Color  => 'Blue',
        },
        Bike => {
            Wheels => 2,
            Color  => 'Red'
        },
        Moped => {
            Wheels => 2,
            Color  => 'Blue'
        },
    );

    # build an index by attribute to item names...
    my %attr;
    for my $name ( keys %item ) {
        for my $token ( keys %{$item{$name}} ) {
            $attr{$token}{$item{$name}{$token}} ||= [];
            push @{$attr{$token}{$item{$name}{$token}}}, $name;
        }
    }

    print "Blue things: @{$attr{Color}{Blue}}\n";
    This kind of two-way lookup is also available on CPAN as Tie::Hash::TwoWay.



Re: When hashes aren't enough
by xorl (Deacon) on May 25, 2004 at 15:30 UTC
    Personally I'd do it just like you're thinking (i.e. $Color{"Car"} = "Blue"). You could also make each item a hash (i.e. $Car{"Color"} = "Blue") but that might be more problematic in the future.

    I haven't figured out OOP yet, so maybe someone else can enlighten us both.

    It might help to know more about what you're trying to do with this information. Why can't you just store it in a database?

Re: When hashes aren't enough
by punkish (Priest) on May 26, 2004 at 04:09 UTC
    Some questions to consider --
    • How big is the final dataset going to be?
    • How many users will be accessing it simultaneously?
    • Is it readonly or read/write?
    • What kind of queries would you want to do?
    • Will it be stored in the program or externally?
    Some choices --
    • AoH in the program
    • flat text file
    • XML
    • use DBD::Sprite
    • SQLite
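
    For the first choice, a minimal sketch of an AoH held in the program and a couple of simple "queries" with grep. The Name key is an assumption of this example, and the Moped row is borrowed from the reply above.

```perl
use strict;
use warnings;

# Array of hashes: each record carries its own name.
my @items = (
    { Name => 'Car',   Wheels => 4, Doors => 4, Color => 'Blue' },
    { Name => 'Bike',  Wheels => 2,             Color => 'Red'  },
    { Name => 'Moped', Wheels => 2,             Color => 'Blue' },
);

# Query 1: names of all blue items.
my @blue = map { $_->{Name} } grep { $_->{Color} eq 'Blue' } @items;

# Query 2: names of items that have a Doors attribute at all.
my @with_doors = map { $_->{Name} } grep { exists $_->{Doors} } @items;

print "Blue: @blue\n";              # Blue: Car Moped
print "With doors: @with_doors\n";  # With doors: Car
```

    Every query here scans the whole array, which is fine for small datasets; the other choices in the list start to pay off as the data or the number of users grows.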
Re: When hashes aren't enough
by BigLug (Chaplain) on May 26, 2004 at 12:10 UTC
    Many wiser monks have had their say, so let me add my own $0.02.

    If you're just storing data, use a Hash of Hashes (countless examples above).

    If your data should be intelligent, use Objects. Consider an example similar to that in perltoot:

    You run a business, you employ people (class 'Employee'). They all do different things, they all get paid different amounts and all work different hours. So, in procedural programming we'd have to know all this and branch around the place like crazy. With OO, it's all under the bonnet and we don't need to touch anything:

    $roger = new Employee::Casual( rate => 14.00 );
    $sally = new Employee::FullTime( salary => 50_000 );
    $mark  = new Employee::FullTime( wage => 12.00 );

    $roger->add_hours( 8 ); # worked 8 hours
    $sally->add_hours( 8 ); # worked 8 hours
    $mark->add_hours( 8 );  # worked 8 hours

    # Now pay them
    $roger->write_check(); # 14 * 8 = $112.00
    $sally->write_check(); # $0.00 - nothing due to a full timer until fortnight is accounted for
    $mark->write_check();  # $0.00 - same here

    # add another 11 days
    # ... same as above, 11 times ...

    # then on day 13, it's a long day
    $roger->add_hours( 10 ); # worked 10 hours
    $sally->add_hours( 10 ); # worked 10 hours
    $mark->add_hours( 10 );  # worked 10 hours

    # So then they take day 14 off
    $roger->sick_leave( 8 ); # 8 hours
    $sally->sick_leave( 8 ); # 8 hours
    $mark->sick_leave( 8 );  # 8 hours

    # and get a work log:
    $roger->write_check();
    # 11 days * 8 hours + 1 day * 10 hours = 98 hours @ $14.00 = $1372
    # (no sick pay for casuals and we've already paid him for day 1)
    $sally->write_check();
    # 14 days @ ($50000 / 365 days) = $1917
    # (no overtime for salary earners)
    $mark->write_check();
    # 12 days * 8 hours + 1 day * 10 hours = 106 hours @ $12.00 = $1272

    # get entitlements
    $roger->leave_entitlements(); # {sick_leave => 0, annual_leave => 0} - none for the casual
    $sally->leave_entitlements(); # {sick_leave => 56, annual_leave => 8.6}
    $mark->leave_entitlements();  # {sick_leave => 56, annual_leave => 8.6}
    As you can see, the objects are intelligent, so it doesn't matter that we tell all the objects ($roger, $sally and $mark) exactly the same thing, they all know, under the hood, what to do.
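
    The Employee classes above are illustrative and don't exist anywhere, so here is a heavily simplified, runnable sketch of the same idea. The pay_for_day method and the rule that a full-timer's day is worth salary / 365 are assumptions of this sketch; it leaves out the fortnight, sick leave and entitlement logic entirely.

```perl
use strict;
use warnings;

# Hypothetical base class: stores attributes and accumulates hours.
package Employee;

sub new {
    my ($class, %args) = @_;
    return bless { hours => 0, %args }, $class;
}

sub add_hours {
    my ($self, $hours) = @_;
    $self->{hours} += $hours;
    return $self;
}

package Employee::Casual;
our @ISA = ('Employee');

# Casuals are paid for exactly the hours they logged.
sub pay_for_day {
    my ($self) = @_;
    return $self->{rate} * $self->{hours};
}

package Employee::FullTime;
our @ISA = ('Employee');

# Assumption: a salaried day is worth salary / 365, whatever hours were logged.
sub pay_for_day {
    my ($self) = @_;
    return $self->{salary} / 365;
}

package main;

my @staff = (
    Employee::Casual->new(rate => 14.00),
    Employee::FullTime->new(salary => 50_000),
);

# Same calls for everyone; each class decides what they mean.
for my $person (@staff) {
    $person->add_hours(8);
    printf "%s is owed \$%.2f\n", ref($person), $person->pay_for_day;
}
```

    The loop never checks what kind of employee it is holding, which is the branching the procedural version would need.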

    To learn how to do that, read perltoot. It's not only a good tutorial, it's an easy read and you'll come away having learned something about perl that you didn't know before. Plus, learning OO is very PerlMonk.

    bless($me, $ISA[0]) for $i (sin($have));
    "Get real! This is a discussion group, not a helpdesk. You post something, we discuss its implications. If the discussion happens to answer a question you've asked, that's incidental." -- in clpm
Re: When hashes aren't enough
by Ninthwave (Chaplain) on May 25, 2004 at 15:36 UTC
    Why not XML?
    <car>
      <wheels count="4" />
      <doors count="4" />
      <color>Blue</color>
    </car>
    This type of problem seems ready for XML, especially if other levels are added later like make, model, year, standard, custom, etc. Also, you will keep the description of the multiple attributes in your schema, allowing the data to be parsed by more than just one specialised app.
    "No matter where you go, there you are." BB
      Why not XML

      Lots of reasons.

      • He'd have to encode it and decode it.
      • He's probably not sending information between machines or processes yet. He may never do that.
      • You can add data to hashes too.
      • It's more complex to retrieve data from XML.
      • You'll have to add XML processing modules to the script. That's fine if you need it, but why?

      Perl has built-in data structures that do the job perfectly well; using XML would mean that a module has to translate XML to and from those very same data structures. Why not skip a step?

      Update: Re-reading the question, I can understand reading it as asking for a storage mechanism. XML's not bad in that case. I prefer YAML, but XML's workable.

        Why not do both XML and HoH? Try Config::General: write your HoH in its XML-like (Apache-style) block syntax, and with a few lines:
        use Config::General qw(ParseConfig);
        my %CF = ParseConfig("./config.dat");

        turn it into HOH in your code.

        The module lets you write out hashes as files too.

        Nothing is too wonderful to be true
        -- Michael Faraday


        This was in response to chromatic above. User error: I hit reply on the wrong node.

        I was going to reply until I saw your update and then didn't, and now I am again anyway. My point was solely based on a storage mechanism. I agree with all your points, and one of the problems I have with XML is the complexity of retrieving data from it, though good schemas and document control can make this less of a task.

        I think XML as a storage mechanism has been a hard road for me personally, but having to work with varied systems recently I have found that the initial design and implementation hurdles are worth the extra effort. You will always find the data you used in an application today required by a different application tomorrow, and XML is the closest I have seen to a "Universal Adapter" or whatever the funky sputnik-looking orb is in the IBM advert.

        "No matter where you go, there you are." BB
      This is a particularly good solution if you require persistence. But then, depending on the amount of data and number of objects to be described, you may wish to go with a relational database.

Node Type: perlquestion [id://356253]
Approved by Limbic~Region
Front-paged by Limbic~Region