Gavin Sinclair #1
Proposal: Array#to_h, to simplify hash generation
Hi -talk,
Ruby has wonderful support for chewing and spitting arrays. For
instance, it's easy to produce an array from any Enumerable using
#map. With hashes, however, it's a bit more cumbersome.
For example, the following method is typical of my code:
# return { filename -> size }
def get_local_gz_files
  files = {}
  Dir["*.gz"].each do |filename|
    files[filename] = File.stat(filename).size
  end
  files
end
The pattern is: create an empty hash, populate it, and return it. Now
Ruby is a wonderfully expressive and terse language. Accordingly, the
two lines devoted to initialising and returning the hash in the above
code seem wasted.
If Ruby had Array#to_h, then I could rewrite it as:
# return { filename -> size }
def get_local_gz_files
  Dir["*.gz"].map { |filename|
    [ filename, File.stat(filename).size ]
  }.to_h
end
The proposed implementation of Array#to_h is per the following code:
class Array
  def to_h
    hash = {}
    self.each do |elt|
      raise TypeError unless elt.is_a? Array
      key, value = elt[0..1]
      hash[key] = value
    end
    hash
  end
end
For the final justification, note that this is the logical reverse of
Hash#to_a:
h = {:x => 5, :y => 10, :z => -1 }
a = h.to_a # => [[:z, -1], [:x, 5], [:y, 10]]
# And now, for my next trick...
a.to_h == h # => true (gosh, that actually worked)
Thoughts?
Gavin
Gavin Sinclair Guest
-
ts #2
Re: Proposal: Array#to_h, to simplify hash generation
>>>>> "G" == Gavin Sinclair <gsinclair@soyabean.com.au> writes:
G> def get_local_gz_files
G> files = {}
G> Dir["*.gz"].each do |filename|
G> files[filename] = File.stat(filename).size
G> end
G> files
G> end
svg% cat b.rb
#!/usr/bin/ruby
def get_local_c_files
  Hash[*Dir["*.c"].map do |filename|
    [filename, File.stat(filename).size]
  end.flatten]
end
p get_local_c_files
svg%
svg% b.rb
{"st.c"=>10714, "range.c"=>10706, "enum.c"=>11250, "util.c"=>22676,
"sprintf.c"=>12332, "re.c"=>38877, "version.c"=>1094, "random.c"=>6485,
"object.c"=>34530, "class.c"=>17870, "main.c"=>988, "compar.c"=>2720,
"array.c"=>43170, "process.c"=>30792, "io.c"=>82748, "dln.c"=>39614,
"variable.c"=>35056, "time.c"=>32796, "string.c"=>69845, "regex.c"=>123352,
"numeric.c"=>36979, "inits.c"=>1765, "dmyext.c"=>20, "dir.c"=>21761,
"signal.c"=>13318, "pack.c"=>39965, "math.c"=>6199, "hash.c"=>39087,
"error.c"=>25114, "parse.c"=>348857, "ruby.c"=>22725, "marshal.c"=>27620,
"lex.c"=>4480, "bignum.c"=>34051, "struct.c"=>15141, "prec.c"=>1677,
"gc.c"=>34935, "file.c"=>58392, "eval.c"=>219839}
svg%
Guy Decoux
ts Guest
-
Brian Candler #3
Re: Proposal: Array#to_h, to simplify hash generation
On Sat, Jul 19, 2003 at 11:22:20PM +0900, Gavin Sinclair wrote:
> If Ruby had Array#to_h, then I could rewrite it as:
It does, almost:
irb(main):001:0> a = ["cat","one","dog","two"]
=> ["cat", "one", "dog", "two"]
irb(main):002:0> Hash[*a]
=> {"cat"=>"one", "dog"=>"two"}
I don't remember seeing an exact inverse of Hash#to_a though, i.e. one which
converts [[a,b],[c,d]] to {a=>b, c=>d}
You can always 'flatten' your array, as long as the elements of the hash
you're creating aren't themselves arrays.
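A minimal sketch of that caveat, using made-up data. It assumes a Ruby version in which Array#flatten accepts a depth argument, which may not hold for the interpreter discussed in this thread:

```ruby
pairs = [["a", [1, 2]], ["b", [3, 4]]]

# A full flatten also unpacks the array values, so keys and values get scrambled.
scrambled = Hash[*pairs.flatten]
# scrambled == {"a"=>1, 2=>"b", 3=>4}

# Flattening a single level keeps the array values intact.
intact = Hash[*pairs.flatten(1)]
# intact == {"a"=>[1, 2], "b"=>[3, 4]}
```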
Regards,
Brian.
Brian Candler Guest
-
Yukihiro Matsumoto #4
Re: Proposal: Array#to_h, to simplify hash generation
Hi,
In message "Proposal: Array#to_h, to simplify hash generation"
on 03/07/19, Gavin Sinclair <gsinclair@soyabean.com.au> writes:
|If Ruby had Array#to_h, then I could rewrite it as:
|
| # return { filename -> size }
| def get_local_gz_files
| Dir["*.gz"].map { |filename|
| [ filename, File.stat(filename).size ]
| }.to_h
| end
It has been proposed several times. The issues are
* whether the name "to_h" is a good name or not. somebody came up
with the name "hashify". I'm not excited by both names.
* what if the original array is not an assoc array (array of arrays
of two elements). raise error? ignore?
matz.
Yukihiro Matsumoto Guest
-
Gavin Sinclair #5
Re: Proposal: Array#to_h, to simplify hash generation
On Sunday, July 20, 2003, 1:31:42 AM, Yukihiro wrote:
> Hi,
> In message "Proposal: Array#to_h, to simplify hash generation"
> on 03/07/19, Gavin Sinclair <gsinclair@soyabean.com.au> writes:
> |If Ruby had Array#to_h, then I could rewrite it as:
> |
> | # return { filename -> size }
> | def get_local_gz_files
> | Dir["*.gz"].map { |filename|
> | [ filename, File.stat(filename).size ]
> | }.to_h
> | end
> It has been proposed several times.
I thought it sounded familiar, but didn't see an RCR.
> The issues are
> * whether the name "to_h" is a good name or not. somebody came up
> with the name "hashify". I'm not excited by both names.
#to_h sounds good to me - we already have to_s, to_a, to_i, etc. It's
just too sweet that Hash#to_a and Array#to_h should be the inverse of
each other.
What don't you like about #to_h?
#to_hash is fine by me too, but I don't really know the nuances of
to_s/to_str, to_a/to_ary, ...
> * what if the original array is not an assoc array (array of arrays
> of two elements). raise error? ignore?
Raise error. #to_h is clearly a method to be used with care. People
are unlikely to call it on random objects. Of course, [1,2,3,4].to_h
could be the equivalent to Hash[1,2,3,4]. But then there's the corner
case: [ [1,2], "x", [7,8], "g" ].to_h.
I think I would insist on the input being an assoc array.
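A rough sketch of what "insisting on an assoc array" could look like. The helper name assoc_to_hash is hypothetical, and raising TypeError (rather than some other error) is only an assumption carried over from the original proposal:

```ruby
# Hypothetical helper, not part of Ruby: strict assoc-array-to-hash conversion.
def assoc_to_hash(pairs)
  pairs.each_with_object({}) do |elt, hash|
    # Assumption: anything that is not a two-element array is a TypeError.
    unless elt.is_a?(Array) && elt.size == 2
      raise TypeError, "expected [key, value] pair, got #{elt.inspect}"
    end
    key, value = elt
    hash[key] = value
  end
end

converted = assoc_to_hash([[:x, 5], [:y, 10]])
# converted == { :x => 5, :y => 10 }

# The corner case from above is rejected instead of being guessed at.
begin
  assoc_to_hash([[1, 2], "x", [7, 8], "g"])
rescue TypeError => e
  puts e.message
end
```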
Gavin
Gavin Sinclair Guest
-
Yukihiro Matsumoto #6
Re: Proposal: Array#to_h, to simplify hash generation
Hi,
In message "Re: Proposal: Array#to_h, to simplify hash generation"
on 03/07/20, Gavin Sinclair <gsinclair@soyabean.com.au> writes:
|I thought it sounded familiar, but didn't see an RCR.
I don't remember the RCR number. Search for "hashify".
|What don't you like about #to_h?
I just didn't feel we had consensus. Besides, "to_h" you've proposed
work for arrays with specific structure (assoc like).
|#to_hash is fine by me too, but I don't really know the nuances of
|to_s/to_str, to_a/to_ary, ...
Longer versions are for implicit conversion. An object that has
"to_str" works like a string if it's given as an argument.
Note we have "to_hash" already. But this would not be the reason for
"to_h". We have "to_io" without the shorter version, for example.
|> * what if the original array is not an assoc array (array of arrays
|> of two elements). raise error? ignore?
|
|Raise error. #to_h is clearly a method to be used with care. People
|are unlikely to call it on random objects. Of course, [1,2,3,4].to_h
|could be the equivalent to Hash[1,2,3,4]. But then there's the corner
|case: [ [1,2], "x", [7,8], "g" ].to_h.
|
|I think I would insist on the input being an assoc array.
TypeError? or ArgumentError?
I just remembered that I thought Hash[ary] might be the better
solution. I'm not sure why I didn't implement it. I have very loose
memory.
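For illustration only: in Ruby versions where Hash[] accepts a single array of pairs (an assumption about the interpreter in use, since the message above says it was not implemented at the time), the Hash[ary] form would read like this:

```ruby
pairs = [[:x, 5], [:y, 10], [:z, -1]]

# Hash[] given one argument that is an array of [key, value] pairs.
h = Hash[pairs]
# h == { :x => 5, :y => 10, :z => -1 }

# Feeding Hash#to_a back in gives an equal hash.
round_trip = Hash[h.to_a]
```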
matz.
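On the short versus long conversion names mentioned in this message, a small sketch may help. The classes Label and StringLike are hypothetical: one defines only to_s, the other also defines to_str and so can stand in where a String is expected.

```ruby
# Hypothetical classes for illustration.
class Label
  def initialize(text)
    @text = text
  end

  # Explicit conversion only: callers have to ask for it.
  def to_s
    @text
  end
end

class StringLike < Label
  # Implicit conversion: String#+ will call this on its argument.
  def to_str
    @text
  end
end

joined = "file: " + StringLike.new("a.gz")
# joined == "file: a.gz"

# An object with only to_s is not accepted as a String argument.
begin
  "file: " + Label.new("a.gz")
rescue TypeError => e
  puts "TypeError: #{e.message}"
end
```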
Yukihiro Matsumoto Guest
-
Gavin Sinclair #7
Re: Proposal: Array#to_h, to simplify hash generation
On Sunday, July 20, 2003, 2:56:08 AM, Yukihiro wrote:
> Hi,
> In message "Re: Proposal: Array#to_h, to simplify hash generation"
> on 03/07/20, Gavin Sinclair <gsinclair@soyabean.com.au> writes:
> |I thought it sounded familiar, but didn't see an RCR.
> I don't remember the RCR number. Search for "hashify".
It's #12. Interesting: I like the #hashify idea better than my proposal.
My original code could be written
# return { filename -> size }
def get_local_gz_files
  Dir["*.gz"].to_hash { |filename| File.stat(filename).size }
end
That does away with the intermediate assoc array, and is overall very
elegant. Best of all, it can be used with any Enumerable type, and it
doesn't have any requirement on the structure of the receiver.
module Enumerable
  def to_hash
    result = {}
    each do |elt|
      result[elt] = yield(elt)
    end
    result
  end
end
That is capturing the very idiom I have repeated so many times.
Alternatives to #to_hash are:
hashify (the original and the worst :)
map_hash
hash_map (it is, after all, mapping a collection into a hash)
I think I like "map_hash" the best.
["cat", "dog", "mouse"].map { |s| s.length }
# -> [3, 3, 5]
["cat", "dog", "mouse"].map_hash { |s| s.length }
# -> {"cat"=>3, "mouse"=>5, "dog"=>3}
Gavin
Gavin Sinclair Guest
-
Kurt M. Dresner #8
Re: Proposal: Array#to_h, to simplify hash generation
> I just didn't feel we had consensus. Besides, "to_h" you've proposed
> work for arrays with specific structure (assoc like).
Far be it from me to say anything of much value, but I definitely think
that an instance function of Class Array should have a defined behavior
for all Arrays. Is there any argument to the contrary?
-Kurt
Kurt M. Dresner Guest
-
Kurt M. Dresner #9
Re: Array and Hash to_s
> The main problem here is that Array#to_s calls join with the default
> field separator, which for some reason is "". To me, this isn't
> intuitive. Is there some historical reason why this behavior exists?
> Even less intuitive to me is Hash#to_s, because the way the conversion
> is done you lose any concept it was a hash.
It's intuitive because it's the opposite of taking a string and putting
each character as an element of an array.
"foobar" -> ['f','o','o','b','a','r'] -> "foobar"
If you want a different .to_s you can just join with something else.
It's pretty easy to just do foobararray.join(',') if you want
"f,o,o,b,a,r", and additionally it's a little easier to read.
-Kurt
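The split-and-join round trip described in this message, as a short sketch (it uses join explicitly rather than relying on what Array#to_s does in any particular Ruby version):

```ruby
chars = "foobar".split("")
# chars == ["f", "o", "o", "b", "a", "r"]

plain  = chars.join        # joins with no separator
commas = chars.join(",")   # joins with an explicit separator
# plain == "foobar", commas == "f,o,o,b,a,r"
```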
Kurt M. Dresner Guest
-
Martin DeMello #10
Re: Proposal: Array#to_h, to simplify hash generation
Gavin Sinclair <gsinclair@soyabean.com.au> wrote:
> Raise error. #to_h is clearly a method to be used with care. People
> are unlikely to call it on random objects. Of course, [1,2,3,4].to_h
> could be the equivalent to Hash[1,2,3,4]. But then there's the corner
> case: [ [1,2], "x", [7,8], "g" ].to_h.
>
> I think I would insist on the input being an assoc array.
And we already have Array methods that assume an associative array.
m.
Martin DeMello Guest
-
Tim Hunter #11
Re: Proposal: Array#to_h, to simplify hash generation
On Sun, 20 Jul 2003 03:20:50 +0900, Kurt M. Dresner wrote:
>> I just didn't feel we had consensus. Besides, "to_h" you've proposed
>> work for arrays with specific structure (assoc like).
> Far be it from me to say anything of much value, but I definitely think
> that an instance function of Class Array should have a defined behavior
> for all Arrays. Is there any argument to the contrary?
>
> -Kurt
pack, assoc, and rassoc
Tim Hunter Guest
-
Yukihiro Matsumoto #12
Re: Proposal: Array#to_h, to simplify hash generation
Hi,
In message "Re: Proposal: Array#to_h, to simplify hash generation"
on 03/07/20, Martin DeMello <martindemello@yahoo.com> writes:
|> I think I would insist on the input being an assoc array.
|
|And we already have Array methods that assume an associative array.
I think you mean assoc and rassoc. But they are look-up methods. No
harm would happen for non assoc input for them. I feel like Hash
creation is little bit different.
matz.
Yukihiro Matsumoto Guest
-
Martin DeMello #13
Re: Proposal: Array#to_h, to simplify hash generation
Yukihiro Matsumoto <matz@ruby-lang.org> wrote:
> In message "Re: Proposal: Array#to_h, to simplify hash generation"
> on 03/07/20, Martin DeMello <martindemello@yahoo.com> writes:
> |
> |And we already have Array methods that assume an associative array.
>
> I think you mean assoc and rassoc. But they are look-up methods. No
> harm would happen for non assoc input for them. I feel like Hash
> creation is little bit different.
Actually, I've always felt those were out of place in Array too. And if
they were factored out into an AssocArray mixin, we could conveniently
put hashify there.
martin
Martin DeMello Guest
-
dblack@superlink.net #14
Re: Proposal: Array#to_h, to simplify hash generation
Hi --
On Mon, 21 Jul 2003, Martin DeMello wrote:
> Yukihiro Matsumoto <matz@ruby-lang.org> wrote:
>
> > In message "Re: Proposal: Array#to_h, to simplify hash generation"
> > on 03/07/20, Martin DeMello <martindemello@yahoo.com> writes:
> > |
> > |And we already have Array methods that assume an associative array.
> >
> > I think you mean assoc and rassoc. But they are look-up methods. No
> > harm would happen for non assoc input for them. I feel like Hash
> > creation is little bit different.
> Actually, I've always felt those were out of place in Array too. And if
> they were factored out into an AssocArray mixin, we could conveniently
> put hashify there.
But the special case of converting an associative array to a hash is
different from the "classic" (in terms of volume of ruby-talk devoted
to it, and how long we've been discussing it :-) array-to-hash
conversion, as per RCR #12 and its definition of "hashify" (a term I
proposed reluctantly, knowing people would hate it :-) but it seemed
the most accurate for what I was describing). Modularization is a
good idea, though, particularly for the various home-grown
[{to_(h}ash]ify) variants in circulation, though organizing that kind
of thing community-wide is something I've never figured out how to do.
David
--
David Alan Black
home: dblack@superlink.net
work: blackdav@shu.edu
Web: http://pirate.shu.edu/~blackdav
dblack@superlink.net Guest
-
dblack@superlink.net #15
Re: Proposal: Array#to_h, to simplify hash generation
Hi --
On Mon, 21 Jul 2003, Gavin Sinclair wrote:
> I like "make_hash". We would get more flexibility if "make_hash" insisted
> on receiving two values for the block: one for key and one for value. In
> one instance recently, I wanted to map the "filename" part of a data
> object to the object itself. This, I think, is readable:
> map = receipts.make_hash { |r| r.filename, r }
>
> Whereas in my pet case of mapping filename to size, we have
>
> map = filenames.make_hash { |fn| fn, File.stat(fn).size }
>
> And your example comes out as
>
> (1..10).make_hash { |i| i, f(i) }
(Wouldn't you have to wrap your two return values in an array to get
the above to parse?)
> I've raised an RCR for this (#148).
I'm not sure how this differs from (rejected) RCR#12 (except for
having to return a key as well as a value).
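A minimal sketch of the array-wrapped form, to make the question concrete. Here make_hash is written as a plain, hypothetical method taking the collection as an argument, not as the Enumerable method proposed in the RCR:

```ruby
# Hypothetical stand-in for the proposed "make_hash".
def make_hash(enum)
  result = {}
  enum.each do |elt|
    # The block returns one array, [key, value], which is unpacked here.
    key, value = yield(elt)
    result[key] = value
  end
  result
end

squares = make_hash(1..5) { |i| [i, i * i] }
# squares == {1=>1, 2=>4, 3=>9, 4=>16, 5=>25}

lengths = make_hash(["cat", "mouse"]) { |s| [s, s.length] }
# lengths == {"cat"=>3, "mouse"=>5}
```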
David
--
David Alan Black
home: dblack@superlink.net
work: blackdav@shu.edu
Web: http://pirate.shu.edu/~blackdav
dblack@superlink.net Guest
-
Martin DeMello #16
Re: Proposal: Array#to_h, to simplify hash generation
Gavin Sinclair <gsinclair@soyabean.com.au> wrote:
> I like "make_hash". We would get more flexibility if "make_hash" insisted
> on receiving two values for the block: one for key and one for value. In
> one instance recently, I wanted to map the "filename" part of a data
> object to the object itself. This, I think, is readable:
>
> map = receipts.make_hash { |r| r.filename, r }
It'd be nice if => were merely an alias for , so that we could say
.make_hash {|r| r.filename => r}
with two arguments, there's always the risk of confusing them.
Perhaps .make_hash {|r| {r.filename => r}}, where it updates the hash
with the anon hash. Inefficient, though, and it has the ugly {{ }}.
> (I don't know why you put the asterix there; Ranges are Enumerable.)
Yeah, I forget that from time to time.
martin
Martin DeMello Guest
-
Gavin Sinclair #17
Re: Proposal: Array#to_h, to simplify hash generation
On Monday, July 21, 2003, 8:18:50 PM, dblack wrote:
> Hi --
> On Mon, 21 Jul 2003, Gavin Sinclair wrote:
>> I like "make_hash". We would get more flexibility if "make_hash" insisted
>> on receiving two values for the block: one for key and one for value. In
>> one instance recently, I wanted to map the "filename" part of a data
>> object to the object itself. This, I think, is readable:
>> map = receipts.make_hash { |r| r.filename, r }
>>
>> Whereas in my pet case of mapping filename to size, we have
>>
>> map = filenames.make_hash { |fn| fn, File.stat(fn).size }
>>
>> And your example comes out as
>>
>> (1..10).make_hash { |i| i, f(i) }
> (Wouldn't you have to wrap your two return values in an array to get
> the above to parse?)
I was kinda hoping not, but so be it. The thin veneer of presenting
tested code vanishes before everyone's eyes. I was surprised to
discover today that code (1) below works, but not code (2).
(1) def foo; 2,4; end
(2) def foo; return 2,4; end
>> I've raised an RCR for this (#148).
> I'm not sure how this differs from (rejected) RCR#12 (except for
> having to return a key as well as a value).
How did O.J. Simpson's second trial differ from his first? ;)
Anyway, I think returning a key as well as a value is a significant
difference:
- much more flexible (I create all kinds of hashes all the time in
my code, and could really use that flexibility)
- less magical, more scrutable: having two values makes it clear what
is going on, given that we're dealing with a hash. With the
single-value to_hash/hashify, I had to keep reminding myself what
it meant; not so with the new "make_hash".
Gavin
Gavin Sinclair Guest
-
Jason Creighton #18
Re: Proposal: Array#to_h, to simplify hash generation
On Mon, 21 Jul 2003 05:32:25 GMT
Martin DeMello <martindemello@yahoo.com> wrote:
> To me, 'hashify' implies taking an assoc array and converting it to hash
> form (or perhaps the perl-influenced [a, b, c, d] -> {a=>b, c=>d}). I
> still can't think of a name for the useful case :) make_hash, perhaps ..
>
> *(1..10).make_hash {|i| f(i)}
>
> or maybe the complementary hash_to and hash_from, where the block is
> respectively the value and the key for the corresponding array entry :)
I'd like the code to be something like this:
module Enumerable
  def to_h
    h = Hash.new
    if block_given?
      self.each { |e| h[e] = yield(e) }
    else
      self.each { |key, value| h[key] = value }
    end
    return h
  end
end
>> (1..5).to_h { |n| n*n }
=> {5=>25, 1=>1, 2=>4, 3=>9, 4=>16}
>> [ [1,2], [3,4] ].to_h
=> {1=>2, 3=>4}
>> [ [1,2], [3,4] ].to_h.to_a
=> [[1, 2], [3, 4]]
>> [ [1,2], [3,4] ].to_h.to_a.to_h
=> {1=>2, 3=>4}
>> Dir["/bin/d*"].to_h { |f| File.size(f) }
=> {"/bin/dnsdomainname"=>9332, "/bin/date"=>25728, "/bin/dd"=>29492, "/bin/dmesg"=>3924, "/bin/df"=>27368, "/bin/domainname"=>9332}
I would almost prefer [1,2,3,4].to_h => {1=>2, 3=>4}, but Hash#to_a returns a
nested array, so that's what this code does. Plus it's easier to implement. :-)
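One thing worth noting about the block-less branch: it never raises, so input that is not an assoc array is converted quietly. A small sketch, with that branch copied into a standalone method under the hypothetical name pairs_to_h:

```ruby
# Standalone copy of the block-less branch (hypothetical name).
def pairs_to_h(enum)
  h = {}
  enum.each { |key, value| h[key] = value }
  h
end

from_pairs = pairs_to_h([[1, 2], [3, 4]])
# from_pairs == {1=>2, 3=>4}

# Non-array elements become keys with nil values.
from_flat = pairs_to_h([1, 2, 3])
# from_flat == {1=>nil, 2=>nil, 3=>nil}

# Elements beyond the second are silently dropped.
from_triple = pairs_to_h([[1, 2, 3]])
# from_triple == {1=>2}
```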
Jason Creighton
Jason Creighton Guest
-
ts #19
Re: Proposal: Array#to_h, to simplify hash generation
>>>>> "G" == Gavin Sinclair <gsinclair@soyabean.com.au> writes:
G> tested code vanishes before everyone's eyes. I was surprised to
G> discover today that code (1) below works, but not code (2).
G> (1) def foo; 2,4; end
G> (2) def foo; return 2,4; end
Well, you want to say (2) works but not (1), no?
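A quick way to check the two forms; this is a sketch in which the method names foo and bar are just placeholders, and eval is used so the file itself still parses:

```ruby
# Form (2): several values after "return" come back as one array.
def foo
  return 2, 4
end

result = foo
# result == [2, 4]

# Form (1): a bare "2,4" as the method body is rejected by the parser.
bare_form_parses =
  begin
    eval("def bar; 2,4; end")
    true
  rescue SyntaxError
    false
  end
# bare_form_parses == false
```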
Guy Decoux
ts Guest
-
Brian Candler #20
Re: Proposal: Array#to_h, to simplify hash generation
On Mon, Jul 21, 2003 at 11:34:39PM +0900, Gavin Sinclair wrote:
> > I'm not sure how this differs from (rejected) RCR#12 (except for
> > having to return a key as well as a value).
> How did O.J. Simpson's second trial differ from his first? ;)
>
> Anyway, I think returning a key as well as a value is a significant
> difference:
> - much more flexible (I create all kinds of hashes all the time in
> my code, and could really use that flexibility)
> - less magical, more scrutable: having two values makes it clear what
> is going on, given that we're dealing with a hash. With the
> single-value to_hash/hashify, I had to keep reminding myself what
> it meant; not so with the new "make_hash".
Can I suggest another name - "collect_hash" - since that's basically what it
is?
collect {... return [x,y] } =>> [[a,b], [c,d], ...]
collect_hash {... return [x,y] } =>> {a=>b, c=>d, ...}
In which case it clearly belongs in Enumerable - see Gavin's implementation
in [RubyTalk:76446]
It's still not an inverse operation to Hash#to_a, and I think there could
still be value in that. You could simulate it of course, using
myhash = myarray.collect_hash { |pair| pair }
If we didn't have Hash#to_a then it could also be implemented as
myarray = myhash.collect { |pair| pair }
But we do, so we don't bother.
Regards,
Brian.
Brian Candler Guest




