[Time] Name | Message |
[09:05] sejo
|
is there a way to send a message to a queue and have multiple subscribers, but only one handles the message?
|
[09:09] guido_g
|
with pub/sub not out of the box
|
[09:09] guido_g
|
check the other socket types like push/pull
|
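A minimal sketch of the "only one subscriber handles each message" behaviour sejo asks about. It uses only the Python standard library as a stand-in and is not ZeroMQ code: `distribute` and its parameters are hypothetical names, and a real PUSH socket deals messages round-robin to connected PULL peers rather than letting them compete on one shared queue.

```python
import queue
import threading

def distribute(messages, worker_count=3):
    """Deliver each message to exactly one of several competing workers."""
    work = queue.Queue()
    handled = []                      # (worker_id, message) pairs
    lock = threading.Lock()

    def worker(worker_id):
        while True:
            msg = work.get()
            if msg is None:           # sentinel: no more work for this worker
                return
            with lock:
                handled.append((worker_id, msg))

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(worker_count)]
    for t in threads:
        t.start()
    for msg in messages:
        work.put(msg)
    for _ in threads:
        work.put(None)                # one sentinel per worker
    for t in threads:
        t.join()
    return handled
```

Every message shows up once in the result, whichever worker happened to take it. With pub/sub, by contrast, every subscriber would see every message.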
[09:13] pieter_hintjens
|
g'morning
|
[09:13] guido_g
|
howdy pieter_hintjens
|
[09:13] guido_g
|
saw a new spec coming up
|
[09:13] pieter_hintjens
|
:-)
|
[09:14] guido_g
|
atm fighting w/ github and documentation
|
[09:14] guido_g
|
see http://guidog.github.com/pyzmq-mdp/
|
[09:14] pieter_hintjens
|
hey, nice
|
[09:14] pieter_hintjens
|
how do you get the subdomain?
|
[09:15] guido_g
|
it's the standard GitHub page
|
[09:15] pieter_hintjens
|
hmm
|
[09:15] guido_g
|
go to a repo -> admin -> check GitHub Pages
|
[09:16] pieter_hintjens
|
yes, looking at it...
|
[09:16] pieter_hintjens
|
nah, forget it, wasted too many years writing HTML already
|
[09:16] pieter_hintjens
|
this is why I like wikidot
|
[09:16] guido_g
|
i'm using Sphinx for documentation
|
[09:17] guido_g
|
just need to fix the css refs and it should work
|
[09:36] mikko
|
good morning
|
[09:37] Steve-o
|
afternoon
|
[09:37] Steve-o
|
just installing msvc x64 again to test out win64 builds
|
[09:39] mikko
|
Steve-o: i was thinking about that as well
|
[09:39] mikko
|
msvc64 builds
|
[09:40] mikko
|
i am investigating using amazon EC2 for linux64/win64
|
[09:40] mikko
|
if win64 works ok i can add daily builds for openpgm as well if you want
|
[09:41] Steve-o
|
i need to test masm64
|
[09:41] Steve-o
|
I doubt mingw-w64 is happy with that
|
[09:43] Steve-o
|
I completely forgot about the lack of inline assembler
|
[09:44] pieter_hintjens
|
Steve-o: afternoon... :-)
|
[09:44] Steve-o
|
so difficult to work with 8 & 16-bit atomics
|
[09:44] pieter_hintjens
|
I forgot to ask you whether there were patches to go into 2.1.2 or not...
|
[09:45] Steve-o
|
I don't know whether you wish to stay on 5.0 or not
|
[09:45] pieter_hintjens
|
I have no opinion on that...
|
[09:45] Steve-o
|
5.0.93 was out in December as a large backport
|
[09:45] Steve-o
|
mainly Win64 stuff, tedious
|
[09:45] Steve-o
|
but someone else posted patches for 5.1.99 for OSX
|
[09:46] Steve-o
|
The 5.1 branch brings in OSX support
|
[09:46] Steve-o
|
forward ported from 3.0
|
[09:46] pieter_hintjens
|
do you think it's doable to bring that into the 2.1.x branch of 0MQ?
|
[09:46] pieter_hintjens
|
all I want is people who download the stable 0MQ to get OpenPGM working with it
|
[09:47] Steve-o
|
The API is very similar, only the definition of restrict changed
|
[09:48] Steve-o
|
I'm only testing Windows with the 5.1 branch
|
[09:48] pieter_hintjens
|
Steve-o: are you testing OpenPGM together with 0MQ?
|
[09:48] Steve-o
|
Mikko's autoconf changes will be really useful though
|
[09:49] pieter_hintjens
|
mikko: can you shine a light here?
|
[09:49] mikko
|
pieter_hintjens: sure
|
[09:49] mikko
|
pieter_hintjens: we've been working on adding autotools build to openpgm
|
[09:50] mikko
|
and we would invoke that build from zeromq build
|
[09:50] pieterh
|
it's not yet in the zeromq build, then
|
[09:50] Steve-o
|
5.1 adds cmake and autoconf build systems for release
|
[09:50] mikko
|
pieterh: i have a branch where this works pretty well
|
[09:50] pieterh
|
mikko: branch off what?
|
[09:50] mikko
|
pieterh: zeromq2
|
[09:50] pieterh
|
what version
|
[09:50] pieterh
|
master?
|
[09:50] mikko
|
master
|
[09:51] pieterh
|
ok, close enough...
|
[09:51] mikko
|
but the changes are easy to port to any branch
|
[09:51] mikko
|
linux and solaris are now pretty well tested
|
[09:51] pieterh
|
do you want to try to get this into 2.1.13?
|
[09:51] Steve-o
|
and with that I want to test the outstanding github issues again
|
[09:52] pieterh
|
it can go into a later stable version too but sooner seems better
|
[09:52] pieterh
|
sorry... 2.1.3!
|
[09:52] mikko
|
well, we can work towards that goal
|
[09:53] mikko
|
most of the work we've been doing doesn't have visibility in the zeromq community
|
[09:53] mikko
|
as the changes tend to go to openpgm trunk
|
[09:53] pieterh
|
indeed, it's been submarinish
|
[09:53] mikko
|
and this hopefully helps openpgm as well
|
[09:54] Steve-o
|
apart from I'm looking after three build systems now :O
|
[09:54] mikko
|
as distros like to package autotools builds
|
[09:54] pieterh
|
oh, joy... :-)
|
[09:54] pieterh
|
ok, mikko, let's see if we can backport these changes to the 2.1 branch sometime
|
[09:54] pieterh
|
i'm happy to help of course
|
[09:55] mikko
|
Steve-o: what is the situation with freebsd?
|
[09:55] mikko
|
openpgm + freebsd?
|
[09:55] Steve-o
|
I use freebsd as part of my network testing suite
|
[09:55] Steve-o
|
but not as a target platform
|
[09:55] Steve-o
|
if that makes sense
|
[09:56] Steve-o
|
I can build and monitor traffic from Solaris, Windows, and Linux perfectly
|
[09:57] mikko
|
is there anything i can help with the builds?
|
[09:57] Steve-o
|
but I have problems building the rest of the test suite on it, the FreeBSD networking stack is a bit too different for how the tests are written
|
[09:58] Steve-o
|
you can try building with the unit tests, -DWITH_CHECK=true and -DWITH_TEST=true in scons
|
[09:58] Steve-o
|
it's just a lot of small differences everywhere
|
[09:59] mikko
|
ok
|
[09:59] Steve-o
|
annoying as Sun works fine
|
[09:59] mikko
|
what about mac os x?
|
[09:59] Steve-o
|
but no one is paying for it though, so meh
|
[09:59] Steve-o
|
My desktop is a Mac mini, so occasionally I boot in and test
|
[10:00] Steve-o
|
I haven't tried the test suite, basic functionality seems ok
|
[10:00] engwan
|
Hi, I'm stumped with just a segfault and no other info
|
[10:00] engwan
|
what might have gone wrong?
|
[10:00] engwan
|
I'm using the ruby ffi client
|
[10:00] engwan
|
and traced the segfault to this call
|
[10:00] engwan
|
msg = @reqs.recv_string(0)
|
[10:00] mikko
|
cremes: here?
|
[10:00] mikko
|
andrewvc: here?
|
[10:01] mikko
|
those two are the rubyists
|
[10:01] andrewvc
|
mikko: yo
|
[10:01] mikko
|
as far as i know
|
[10:01] andrewvc
|
engwan: which ruby/ZMQ version are you using
|
[10:02] engwan
|
i tried zmq 2.0.8, 2.0.10 and 2.1.0
|
[10:02] engwan
|
none worked
|
[10:02] engwan
|
ruby 1.9.2
|
[10:02] andrewvc
|
hmmm
|
[10:02] andrewvc
|
it should work well with 1.9.2/HEAD
|
[10:02] andrewvc
|
generally, we work for compat with HEAD, not the released versions
|
[10:02] Steve-o
|
mikko: I cannot imagine many businesses wanting to use PGM on OSX though, I wait to be surprised
|
[10:02] andrewvc
|
however, it should work regardless
|
[10:02] engwan
|
im actually trying to run my rails app using rack-mongrel2
|
[10:02] andrewvc
|
what exactly are you trying to do in the code?
|
[10:03] mikko
|
Steve-o: only developers i guess
|
[10:03] engwan
|
and segfaults when i try to start the server
|
[10:03] Steve-o
|
mikko: Oddly enough I was the first to port TIBCO's Rendezvous to OS X
|
[10:03] engwan
|
i traced the segfault to line 23
|
[10:03] engwan
|
in https://github.com/darkhelmet/rack-mongrel2/blob/master/lib/mongrel2/connection.rb
|
[10:04] Steve-o
|
mikko: it made sales and architects happy to show off
|
[10:04] Steve-o
|
mikko: plus it took nearly 2 years for TIBCO to actually do it themselves
|
[10:05] mikko
|
hehehe
|
[10:05] Steve-o
|
mikko: OSX APIs are older than FreeBSD though which is really gnarly
|
[10:06] mikko
|
Steve-o: but it has a pretty gui!
|
[10:06] Steve-o
|
I'm under the impression Lion is getting a boost though
|
[10:07] Steve-o
|
and hopefully a tidy up for IPv6
|
[10:08] Steve-o
|
I still don't have a Java API yet because the platform is too old, when will Oracle push v7 out of the door?
|
[10:10] Steve-o
|
bindings are all pretty difficult to implement due to immaturity of IPv6 and SSM support :(
|
[10:13] Steve-o
|
I've tried Erlang, Java, Perl, and Python. All I managed to get working was a trivial Perl API.
|
[10:18] Guthur
|
IPv6 seems to be taking a long time to get ready, in general
|
[10:18] Guthur
|
considering it was needed like yesterday
|
[10:19] Steve-o
|
The APIs are hilarious, they have changed like 3-4 times
|
[10:20] Steve-o
|
And clearly most developers are not going to get scopes or zone identifiers working well
|
[10:22] Steve-o
|
And Solaris, ugh, you cannot assign a static IPv6 address (!)
|
[10:23] Steve-o
|
Setup RIP and give everyone else on the network another address for fun
|
[10:27] Steve-o
|
Also, the PGM protocol is IP agnostic so you can end up with really insane environments hopping between families
|
[10:30] Guthur
|
Steve-o: sounds like fun times, hehe
|
[10:33] Steve-o
|
mmm, CMake not generating a Makefile for Win64, ...
|
[10:34] Guthur
|
pieterh: still not renamed majordomo yet?
|
[10:34] pieterh
|
Guthur: no intention, that's the name of the pattern
|
[10:34] pieterh
|
you mean because of the old mailing list software?
|
[10:34] Guthur
|
yep
|
[10:35] pieterh
|
nah
|
[10:35] Guthur
|
cool
|
[10:35] stimpie
|
is there a limit on the number of 'connections' a socket can have?
|
[10:35] Guthur
|
I just thought you were going to change it
|
[10:37] pieterh
|
Guthur: the software seems quite dead, sadly
|
[10:37] pieterh
|
stimpie: not in 0mq, but your OS will have limits
|
[10:39] andrewvc
|
engwan: I'm not familiar with that codebase
|
[10:39] andrewvc
|
does it use threading at all by chance?
|
[10:39] engwan
|
i actually figured out something
|
[10:39] andrewvc
|
oh?
|
[10:39] engwan
|
it segfaults when i use threading
|
[10:39] engwan
|
coz im using a custom loader
|
[10:39] engwan
|
that forks into separate threads
|
[10:40] engwan
|
i was trying to achieve this: load rails, then fork 3 processes
|
[10:40] engwan
|
each connecting to the same zeromq/mongrel
|
[10:40] andrewvc
|
yeah
|
[10:41] engwan
|
are there inherent problems with threading and zeromq?
|
[10:41] andrewvc
|
well, sharing sockets between threads isn't something you're supposed to do with zmq
|
[10:41] andrewvc
|
you can pass one between threads, but you may not concurrently access one, so far as I know
|
[10:41] andrewvc
|
at any rate, here's what I'd do
|
[10:41] andrewvc
|
have a single thread that handles all messages on a single socket
|
[10:41] andrewvc
|
and delegate to worker processes
|
[10:42] andrewvc
|
that, or spawn one socket per thread
|
[10:43] engwan
|
hmm.. but i tried just spawning one thread.. and it still segfaults
|
[10:44] andrewvc
|
hmmm, if the thread is different than the one that created the socket, you may still have issues
|
[10:44] engwan
|
my loader forks/execs then creates the socket
|
[10:44] engwan
|
then segfault
|
[10:44] andrewvc
|
ohhhhhhhhhh
|
[10:44] andrewvc
|
there's your problem
|
[10:44] engwan
|
but if i do it directly
|
[10:44] engwan
|
without fork
|
[10:44] engwan
|
it works
|
[10:44] andrewvc
|
yeah, exactly
|
[10:44] pieterh
|
sockets are not thread safe
|
[10:44] andrewvc
|
I've found that requiring ffi-rzmq before forking does not work
|
[10:44] andrewvc
|
so, don't require ffi-rzmq till after you've forked
|
[10:44] engwan
|
ohh...
|
[10:45] andrewvc
|
each process will need its own context anyway
|
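A rough sketch of the "fork first, then create the context" advice, using only `os.fork` and pipes from the Python standard library (Unix only). The dictionary called `context` is a hypothetical stand-in for a real ZeroMQ context, and `spawn_workers` is an illustrative name rather than anything from ffi-rzmq or rack-mongrel2.

```python
import os

def spawn_workers(count):
    """Fork `count` children; each builds its own 'context' after the fork."""
    readers = []
    for _ in range(count):
        r, w = os.pipe()
        pid = os.fork()
        if pid == 0:                        # child process
            os.close(r)
            # Created after the fork, so it is never shared with the parent.
            context = {"owner_pid": os.getpid()}
            os.write(w, str(context["owner_pid"]).encode())
            os.close(w)
            os._exit(0)
        os.close(w)                         # parent keeps only the read end
        readers.append((pid, r))

    owners = []
    for pid, r in readers:
        owners.append((pid, int(os.read(r, 64).decode())))
        os.close(r)
        os.waitpid(pid, 0)                  # reap the child
    return owners
```

Each pair returned is (child pid, pid that created the context); they match because nothing context-like exists before the fork.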
[10:45] andrewvc
|
still, you should have one socket per thread
|
[10:45] andrewvc
|
or just have an IO dedicated thread that handles a single socket
|
[10:46] andrewvc
|
you can even use eventmachine for that with em-zeromq
|
[10:46] andrewvc
|
kinda like a zmq version of thin
|
[10:46] andrewvc
|
though, with a low number of sockets there's not much benefit
|
[10:50] andrewvc
|
pieterh: does that advice make sense for you? Would you go one socket per worker, or just spawn a single thread reading off the sock and delegating to workers
|
[10:50] andrewvc
|
in a multi-threaded app
|
[10:50] pieterh
|
... typically, one TCP socket facing clients, and multiple inproc sockets sending work to/from workers
|
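The layout pieterh describes can be sketched roughly as follows, with standard-library queues standing in for sockets. This is an assumption-laden model, not ZeroMQ code: `frontend` plays the single client-facing TCP socket, each entry in `inboxes` plays an inproc socket, and `serve` is a hypothetical name. The point is that only one thread ever touches the client-facing endpoint.

```python
import queue
import threading

def serve(requests, worker_count=2):
    """One I/O thread owns the frontend; workers only see their own inbox."""
    frontend = queue.Queue()                                 # client-facing "socket"
    inboxes = [queue.Queue() for _ in range(worker_count)]   # per-worker "inproc"
    replies = queue.Queue()

    def io_thread():
        turn = 0
        while True:
            req = frontend.get()                 # only this thread reads frontend
            if req is None:                      # shutdown: tell every worker
                for inbox in inboxes:
                    inbox.put(None)
                return
            inboxes[turn % worker_count].put(req)
            turn += 1

    def worker(worker_id):
        while True:
            req = inboxes[worker_id].get()
            if req is None:
                return
            replies.put((worker_id, req.upper()))   # pretend to do some work

    threads = [threading.Thread(target=io_thread)]
    threads += [threading.Thread(target=worker, args=(i,))
                for i in range(worker_count)]
    for t in threads:
        t.start()
    for req in requests:
        frontend.put(req)
    frontend.put(None)
    for t in threads:
        t.join()

    out = []
    while not replies.empty():
        out.append(replies.get())
    return out
```

The alternative mentioned in the conversation, one socket per thread, avoids the hand-off entirely at the cost of one connection per worker.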
[10:51] pieterh
|
you need to know why you're using multiple threads
|
[10:51] andrewvc
|
well, it's engwan's thing
|
[10:51] andrewvc
|
a mongrel2 handler
|
[10:51] andrewvc
|
each thread has a full rails process
|
[10:52] pieterh
|
each thread is a handler?
|
[10:52] andrewvc
|
I tend to think inproc isn't so good in ruby, as there's a serialization cost. In C you can just stuff a struct in one, in ruby you have to serialize/deserialize
|
[10:52] andrewvc
|
yeah, each thread handles a separate HTTP request using rails
|
[10:52] engwan
|
I'm just creating an easier way to create lots of rails processes connecting to mongrel2
|
[10:52] engwan
|
instead of starting each one individually
|
[10:52] engwan
|
i thought of starting one then forking
|
[10:53] andrewvc
|
well, here's the other thing about having a master thread
|
[10:53] andrewvc
|
you can have a local-global request queue
|
[10:53] pieterh
|
engwan: ok, so give each thread its own context
|
[10:53] pieterh
|
its own sockets, let them connect independently back to mongrel2
|
[10:53] engwan
|
yup, thats what im doing
|
[10:53] andrewvc
|
i mean, as I understand it, ZMQ will round-robin requests to various workers
|
[10:53] andrewvc
|
so you want the workers pulling work, not getting it pushed to them, ideally
|
[10:53] engwan
|
you were right andrewvc regarding the require thing
|
[10:54] andrewvc
|
ah, cool, glad I could help
|
[10:54] engwan
|
yeah, im not pushing work to the workers, rather the workers themselves connect to mongrel2 and ask for work
|
[10:54] andrewvc
|
you could either use the router/dealer stuff for that, or the easy way would be just to use a ruby construct to pull the work
|
[10:54] pieterh
|
engwan: sounds fine, where are you creating the context?
|
[10:54] andrewvc
|
yeah, I'm not nuts about that part of mongrel2
|
[10:55] andrewvc
|
since it uses a PUSH socket
|
[10:55] engwan
|
i created the context after forking
|
[10:55] andrewvc
|
PUSH has no backpressure right pieter?
|
[10:55] pieterh
|
so as far as zeromq is concerned, this should all be safe
|
[10:55] andrewvc
|
so, a slow worker will get equal work, correct
|
[10:55] pieterh
|
andrewvc: nope, it's just fanout
|
[10:55] pieterh
|
i mean, yes, you're right
|
[10:55] andrewvc
|
yeah, so engwan, I think that's kind of a design deficiency in mongrel2
|
[10:56] andrewvc
|
in that a slow worker will queue messages, even while others are free
|
[10:56] andrewvc
|
you could mitigate this in your handler by implementing your own global queue
|
[10:56] andrewvc
|
off a single socket
|
[10:56] pieterh
|
what mongrel2 should use is the LRU queue pattern
|
[10:57] pieterh
|
but I think engwan's question is "why is it crashing?" :-)
|
[10:57] engwan
|
oh, so these workers listen to the socket and receive zeromq messages, plainly round robin?
|
[10:57] andrewvc
|
oh, the crashing part has been fixed
|
[10:57] andrewvc
|
it was related to forking
|
[10:57] pieterh
|
ah, ok, excellent
|
[10:57] pieterh
|
so engwan, read Chapter 3 of the Guide
|
[10:57] pieterh
|
(and 1 and 2)
|
[10:57] pieterh
|
and if you want a really balanced mongrel2 engine...
|
[10:57] andrewvc
|
pieterh: I agree about the LRU queue
|
[10:58] pieterh
|
one LRU queue that accepts everything from mongrel2 and then distributes that to workers on a LRU basis
|
[10:58] andrewvc
|
but mongrel 2 uses PUSH and SUB
|
[10:58] pieterh
|
np
|
[10:58] andrewvc
|
ah, I see, just inject it eh
|
[10:58] pieterh
|
proxy
|
[10:58] andrewvc
|
that'd be pretty cool actually
|
[10:58] engwan
|
ok, will read more about zeromq.. i read the mongrel2 docs but forgot to read the zeromq docs
|
[10:58] pieterh
|
0MQ is a construction kit
|
[10:58] engwan
|
:D
|
[10:58] andrewvc
|
that is cool about zmq, that you can fix a design flaw like that so easily
|
[10:58] pieterh
|
it's not a design flaw
|
[10:58] pieterh
|
remember that Zed made this before we'd documented any LRU patterns
|
[10:59] andrewvc
|
ah, yeah, sorry, that's correct
|
[10:59] pieterh
|
if workers are reasonably behaved, it's simple and fast since there's no chit chat
|
[10:59] andrewvc
|
yeah
|
[11:00] andrewvc
|
someone, not me, should write a LRU Queue middleware in some generic fashion for mongrel 2
|
[11:01] pieterh
|
<insert call for volunteers here> :-)
|
[11:01] andrewvc
|
lol
|
[11:01] engwan
|
lol
|
[11:02] guido_g
|
hehe
|
[11:03] guido_g
|
pieterh: http://guidog.github.com/pyzmq-mdp/ <- the link you might want to add to the implementations section or mdp project page
|
[11:03] andrewvc
|
pieterh: well, is it LRU you want really
|
[11:03] andrewvc
|
nm, I guess any router/dealer would work. the important thing is the clients don't queue messages
|
[11:04] pieterh
|
guido_g: done
|
[11:04] guido_g
|
thx
|
[11:05] pieterh
|
andrewvc: yes, LRU is the pattern that comes back again and again
|
[11:05] andrewvc
|
is that because it guarantees that our workers service one req at a time and leave queuing to the router
|
[11:06] andrewvc
|
just making sure I'm on the same page?
|
[11:06] pieterh
|
yes, it basically:
|
[11:06] pieterh
|
- allows synchronous workers
|
[11:06] pieterh
|
- only gives work to a worker that's explicitly ready
|
[11:06] pieterh
|
- queues requests in one place
|
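To make the difference between blind fan-out and LRU dispatch concrete, here is a small deterministic simulation in plain Python. It is a sketch under stated assumptions: all jobs are queued at time zero, every job costs the same, and `seconds_per_job` gives how long each worker needs for one job. The function names are hypothetical and nothing here is taken from mongrel2 or the Guide's lruqueue example.

```python
import heapq

def round_robin_finish(job_count, seconds_per_job):
    """Hand jobs out in turn, ready or not; return when the last one finishes."""
    busy_until = [0] * len(seconds_per_job)
    for i in range(job_count):
        w = i % len(seconds_per_job)
        busy_until[w] += seconds_per_job[w]      # job waits in that worker's queue
    return max(busy_until)

def lru_finish(job_count, seconds_per_job):
    """Queue jobs in one place; give each to the worker that became ready first."""
    # Heap of (time the worker signalled ready, worker index).
    ready = [(0, w) for w in range(len(seconds_per_job))]
    heapq.heapify(ready)
    finish = 0
    for _ in range(job_count):
        free_at, w = heapq.heappop(ready)
        done = free_at + seconds_per_job[w]
        finish = max(finish, done)
        heapq.heappush(ready, (done, w))         # worker is ready again at `done`
    return finish
```

With six jobs and workers that need 1 and 10 seconds per job, `round_robin_finish(6, [1, 10])` gives 30 while `lru_finish(6, [1, 10])` gives 10: the slow worker takes one job instead of three.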
[11:07] pieterh
|
all of the more sophisticated queuing broker designs we make are based on LRU
|
[11:07] andrewvc
|
hmm, is there zdevice for it yet?
|
[11:07] pieterh
|
sure
|
[11:07] pieterh
|
mdbroker example, and guido_g's Python implementation he just referred to
|
[11:07] pieterh
|
this is MDP, rather more than basic LRU
|
[11:08] pieterh
|
lruqueue example, etc.
|
[11:09] andrewvc
|
cool
|
[11:09] andrewvc
|
I'll be integrating this into my own work for sure. I really need to re-read the guide, there's more and more good stuff in there
|
[11:10] guido_g
|
i think over time it'll evolve into the standard book for using ømq
|
[11:10] andrewvc
|
yeah
|
[11:10] pieterh
|
guido_g: that's kind of the idea, yeah
|
[11:10] guido_g
|
we might even see a dead tree edition -- will we?
|
[11:10] pieterh
|
yup
|
[11:10] pieterh
|
lulu.com eventually
|
[11:15] pieterh
|
ok, have to go, cyal
|
[11:16] guido_g
|
cu
|
[11:17] guido_g
|
oh, lunch time already
|
[13:16] yrashk
|
hey
|
[13:17] yrashk
|
out of curiosity, does 0mq work on android and/or iOS?
|
[13:22] Guthur
|
yrashk: I remember someone mentioning doing it
|
[13:22] yrashk
|
android or iOS?
|
[13:22] Guthur
|
sorry I can't remember any other details
|
[13:22] Guthur
|
yrashk: I can't remember tbh, it was some smartphone though
|
[13:23] yrashk
|
I see
|
[13:23] Guthur
|
sorry
|
[13:23] yrashk
|
thanks anyway :)
|
[13:23] Guthur
|
fire something onto the mailing list
|
[13:23] yrashk
|
not curious enough
|
[13:24] yrashk
|
was just chatting with a friend about a project he's working on for his employer
|
[13:25] Guthur
|
http://www.zeromq.org/distro:android
|
[13:26] yrashk
|
thanks!
|
[13:27] Guthur
|
http://www.mail-archive.com/zeromq-dev@lists.zeromq.org/msg05684.html
|
[13:27] Guthur
|
there was the mailing thread ^^^
|
[13:44] Guthur
|
pieterh: png?
|
[13:44] Guthur
|
ping*
|
[13:45] pieterh
|
Guthur: hi
|
[13:45] Guthur
|
the link to the new C# examples I sent is not bold
|
[13:45] Guthur
|
in the language list below the example
|
[13:45] pieterh
|
it should be fixed now, requires a rebuild of the docs
|
[13:46] Guthur
|
oh ok
|
[13:46] Guthur
|
oh yeah, sorry i needed to refresh
|
[13:46] Guthur
|
doh
|
[13:46] pieterh
|
so you're at 61% :-)
|
[13:46] Guthur
|
i'm on peering2 at the moment
|
[13:47] pieterh
|
oh, sorry... 54%...
|
[13:47] Guthur
|
oh, I thought I was 54% before I sent those
|
[13:47] pieterh
|
hang on, scoreboard seems wrong...
|
[13:47] pieterh
|
these pages are not produced dynamically
|
[13:51] Guthur
|
btw, can more than one socket bind to a named pipe?
|
[13:51] pieterh
|
named pipe, you mean ipc? nope
|
[13:51] Guthur
|
ie. IPC
|
[13:51] Guthur
|
how is peering2 working then
|
[13:52] pieterh
|
it's using different ipc endpoints but the names may be confusingly similar
|
[13:52] Guthur
|
// Prepare local frontend and backend
void *localfe = zmq_socket (context, ZMQ_XREP);
zmq_bind (localfe, "ipc://localfe.ipc");
void *localbe = zmq_socket (context, ZMQ_XREP);
zmq_bind (localbe, "ipc://localbe.ipc");
|
[13:52] Guthur
|
arrrgh
|
[13:52] Guthur
|
sorry
|
[13:52] Guthur
|
those binds are causing an error when I try to run two instances
|
[13:53] pieterh
|
yeah, you need different peer names
|
[13:53] pieterh
|
peering2 you me, peering2 me you
|
[13:53] Guthur
|
but those are hardcoded strings
|
[13:54] pieterh
|
peering2 takes the peer name from the command line
|
[13:55] Guthur
|
but they are not used for the localfe and localbe
|
[13:55] Guthur
|
they are used by the cloud ones
|
[13:55] Guthur
|
am I supposed to change those hardcoded strings
|
[13:55] Guthur
|
I assume not
|
[14:00] private_meta
|
dammit... the first character of a message being 0 in the zmsg example causes SERIOUS problems for using any type of intelligent container
|
[14:03] pieterh
|
Guthur: sorry, was getting a coffee... let me check
|
[14:03] pieterh
|
heh, IMO peering2 just ignores the error
|
[14:04] pieterh
|
this used to use inproc and I switched to ipc
|
[14:05] pieterh
|
localfe and localbe can't be hardcoded on ipc as they are
|
[14:06] pieterh
|
I changed the client & worker threads to use their own contexts
|
[14:06] pieterh
|
I'll need to make more changes... thanks for finding this error
|
[14:15] Guthur
|
pieterh: not a bother
|
[14:15] pieterh
|
Guthur: ok, I've fixed peering2 and peering3
|
[14:16] Guthur
|
cheers, I'll update mine later
|
[14:16] pieterh
|
you can find the new versions in the git, they use unique endpoints and they use assertions properly on binds
|
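One way to get unique endpoints per instance is to derive them from the peer name passed on the command line. The sketch below is only an illustration: the `<name>-localfe.ipc` naming scheme and the helper `local_endpoints` are assumptions, not necessarily what the fixed peering2 and peering3 examples use.

```python
def local_endpoints(peer_name):
    """Build per-instance ipc endpoints from the peer name."""
    if not peer_name or "/" in peer_name:
        raise ValueError("peer name must be a non-empty string without '/'")
    return {
        "localfe": f"ipc://{peer_name}-localfe.ipc",
        "localbe": f"ipc://{peer_name}-localbe.ipc",
    }
```

Two instances started as `peering2 me you` and `peering2 you me` then bind to disjoint endpoints. Checking the result of each bind, as the updated examples are said to do with assertions, makes a clash visible instead of being silently ignored.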
[14:16] Guthur
|
cool
|
[14:16] Guthur
|
I want to get up to Ch4 asap
|
[14:16] pieterh
|
it's really great that you double-check my work like this
|
[14:16] pieterh
|
getting the examples right is sometimes delicate
|
[14:16] Guthur
|
pieterh: tbh it's only because I was implementing the example and wanted to test my own code
|
[14:17] Guthur
|
if I had already completed it then it would have gone by unchecked
|
[14:17] pieterh
|
it's a treasure trove, the examples archive
|
[14:17] private_meta
|
hmm
|
[14:17] pieterh
|
private_meta: still having trouble with nulls?
|
[14:18] private_meta
|
I circumvented it
|
[14:18] private_meta
|
I have a different problem now
|
[14:18] pieterh
|
i had to hack around it too, in zmsg
|
[14:18] private_meta
|
When I debug the program, it works, when I run it normally, it fails
|
[14:18] pieterh
|
usually a timing issue... so don't debug it, use prints
|
[14:18] private_meta
|
In the zmsg test, when normally executing it without gdb, it doesn't receive the "Hello"
|
[14:18] pieterh
|
multiple threads, I assume
|
[14:18] private_meta
|
Well, it's the OTHER way around, when I use prints to debug it fails
|
[14:18] private_meta
|
no
|
[14:18] private_meta
|
one thread
|
[14:19] private_meta
|
your zmsg example, c++ified
|
[14:19] pieterh
|
which zmsg example? there are... a few...
|
[14:19] pieterh
|
ah, I mean, what test program
|
[14:19] private_meta
|
the one in the master branch, C folder, zmsg.h
|
[14:19] pieterh
|
the selftest method?
|
[14:19] private_meta
|
Yes
|
[14:20] pieterh
|
hmm, so it sends a message to itself and checks it read back the right stuff
|
[14:20] pieterh
|
my advice is (a) forget the case where it works, gdb, and use only the case where it breaks
|
[14:21] pieterh
|
and (b) put printfs in the recv method to see what you're actually getting back, if anything
|
[14:21] private_meta
|
damn, i just removed the print statements because it DID work in debug
|
[14:23] private_meta
|
If I put a sleep(1) before the receive method, it works
|
[14:23] pieterh
|
you're using the same ipc:// transport?
|
[14:24] pieterh
|
0MQ has some of these weird timing issues in various places
|
[14:24] pieterh
|
I'll have to document them, there are a bunch of them
|
[14:25] private_meta
|
pieterh: It's pretty much a translation, I didn't change anything important in its meaning
|
[14:25] pieterh
|
private_meta: ok, first thing is to forget the case where it works, and identify which timing issue it is
|
[14:26] pieterh
|
move the sleep around... after the bind, before connect
|
[14:26] pieterh
|
I can't see what it'd be, right away
|
[14:27] private_meta
|
pieterh: It HAS to be between send and receive
|
[14:28] pieterh
|
could be between bind and connect
|
[14:28] pieterh
|
it's one thread, there are only two async things happening
|
[14:28] private_meta
|
no, I mean, if the sleep isn't between send and receive, it won't work
|
[14:29] pieterh
|
are you doing a non-blocking recv?
|
[14:29] pieterh
|
recv should block forever until there's input
|
[14:29] private_meta
|
Apparently it receives an empty message
|
[14:29] private_meta
|
Ok, wait a second
|
[14:29] private_meta
|
I'd like to put it on gist, but git lost my logon information
|
[14:30] pieterh
|
silly git...
|
[14:31] pieterh
|
:-)
|
[14:32] private_meta
|
ok http://codepad.org/7VjMTxTR this is the code
|
[14:32] private_meta
|
http://codepad.org/rlTpafqH This is the output
|
[14:32] private_meta
|
The Message, part 2, should read "[005] Hello", but it is empty
|
[14:33] private_meta
|
http://codepad.org/ucswZvfR output if sleep between zm.send() and zm.recv() is in place
|
[14:34] yrashk
|
pieterh: speaking of ezmq/erlzmq, if we'll decide to rename, would you consider creating zeromq/erlzmq2 repo?
|
[14:35] pieterh
|
yrashk: of course
|
[14:35] yrashk
|
great
|
[14:35] pieterh
|
private_meta: weird, so it gets a couple of parts correctly and not the remainder...
|
[14:36] yrashk
|
Dhammika already +1'd this renaming
|
[14:36] yrashk
|
I am waiting for others to vote
|
[14:36] private_meta
|
pieterh: It even gets the size of the message right
|
[14:37] private_meta
|
I've put in a line to output content and size of the message
|
[14:37] private_meta
|
recv: "", size 17
|
[14:37] private_meta
|
recv: "", size 5
|
[14:37] private_meta
|
I get this
|
[14:37] pieterh
|
private_meta: hang on, there message output is strange
|
[14:37] private_meta
|
For the first message that's ok, as the "0" prevents a normal output
|
[14:37] pieterh
|
*the
|
[14:37] private_meta
|
The second message should show "Hello"
|
[14:37] private_meta
|
what's strange in the output?
|
[14:38] pieterh
|
well, we send 1 part, "hello" on an XREQ socket
|
[14:38] pieterh
|
and we receive on an XREP socket, which should prepend the identity
|
[14:38] pieterh
|
so the message should be two frames, identity and then hello
|
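The framing pieterh describes can be modelled in a few lines of plain Python. This is a sketch, not the zmsg code: `router_receive` and `dump` are hypothetical helpers, and the "[size] body" layout is an assumption based on the "[005] Hello" line quoted earlier in the conversation.

```python
def router_receive(sender_identity, frames):
    """Model what a router-style (XREP) socket hands to the application."""
    return [sender_identity] + list(frames)

def dump(frames):
    """Render each frame as '[size] text', or as hex when it is not printable."""
    lines = []
    for frame in frames:
        printable = all(32 <= b < 127 for b in frame)
        body = frame.decode("ascii") if printable else frame.hex().upper()
        lines.append(f"[{len(frame):03d}] {body}")
    return lines
```

Sending the single frame `b"Hello"` from a peer whose 17-byte identity starts with a zero byte yields two frames on the receiving side. The identity is dumped as hex because of that leading zero, and the body is dumped as `[005] Hello`.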
[14:38] pieterh
|
http://codepad.org/ucswZvfR is not correct
|
[14:39] private_meta
|
pieterh: where is it not correct?
|
[14:39] private_meta
|
Wait, let me change something
|
[14:39] pieterh
|
it should print two frames, identity + hello body
|
[14:40] pieterh
|
ok, this is for multiple messages, not just the first one
|
[14:40] pieterh
|
I think you're assembling the frames wrongly
|
[14:41] private_meta
|
I've made a new line between each dump now
|
[14:41] private_meta
|
http://codepad.org/PeaZD9GI
|
[14:41] pieterh
|
see http://codepad.org/nSsVMEhq
|
[14:41] private_meta
|
The first one is a dump of sendmessage
|
[14:41] private_meta
|
the second and third are received messages
|
[14:41] pieterh
|
ah, ok
|
[14:42] pieterh
|
so... in the case when it fails
|
[14:42] private_meta
|
So, the output is as it is supposed to look, but only if there is some waiting between send and receive
|
[14:42] pieterh
|
you read the ID frame but not the content frame
|
[14:43] pieterh
|
can you remove all sleeps, and add printfs until you know the cause of that 'not reading the content frame'?
|
[14:43] private_meta
|
I read ID size and message, and I read content size, but not content message
|
[14:43] pieterh
|
how can you read the content size?
|
[14:43] private_meta
|
How should I know?
|
[14:43] private_meta
|
I just tested it
|
[14:44] private_meta
|
see @15:37
|
[14:44] pieterh
|
:-) I mean, you ask 0MQ for a 'message' and it returns you size+data
|
[14:44] private_meta
|
I get the message
|
[14:44] private_meta
|
message.data returns zip, message.size returns the correct size of the message
|
[14:44] pieterh
|
wow
|
[14:45] private_meta
|
The message should be "Hello", so the size is correct to be "5"
|
[14:45] pieterh
|
I've never seen that before
|
[14:45] private_meta
|
Story of my life
|
[14:46] pieterh
|
what OS are you using?
|
[14:46] private_meta
|
Ubuntu
|
[14:46] private_meta
|
64bit
|
[14:46] pieterh
|
multicore box?
|
[14:46] pieterh
|
latest version of 0MQ?
|
[14:47] pieterh
|
ah, wait...
|
[14:47] pieterh
|
:-) obviously
|
[14:47] pieterh
|
hehe
|
[14:47] private_meta
|
I'm using it on a VirtualBox Virtual Machine
|
[14:47] private_meta
|
zmq 2.1.1
|
[14:48] pieterh
|
you are indeed sending 5 bytes
|
[14:48] pieterh
|
but the buffer you are sending is not hanging around
|
[14:48] pieterh
|
let me double check how you implemented zmq_send...
|
[14:49] pieterh
|
The "I send stuff and receive garbage" symptom comes from sending from a transient buffer
|
[14:49] pieterh
|
socket.send is not copying, I assume
|
[14:50] private_meta
|
hm?
|
[14:50] pieterh
|
I'm not familiar with exactly how the C++ API works, but...
|
[14:51] pieterh
|
but look at how the C zmsg_send function sends a frame:
|
[14:51] pieterh
|
zmq_msg_init_size (&message, size);
|
[14:51] pieterh
|
memcpy (zmq_msg_data (&message), data, size);
|
[14:51] private_meta
|
&message is syntactically impossible there
|
[14:52] pieterh
|
that's not the point (message is a 0MQ message here)
|
[14:52] pieterh
|
the point is memcpy
|
[14:52] private_meta
|
ah, that's what you mean
|
[14:53] private_meta
|
Instead of memcpy I use rebuild... You think rebuild can't be used for that?
|
[14:53] private_meta
|
Don't tell me I can't use the message without memcpy
|
[14:53] pieterh
|
I've no idea but when you send "Hello" and receive "?@$AS" it's because the buffer you sent disappeared before 0MQ could asynchronously finish sending
|
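The symptom, correct size but wrong content, can be reproduced with a toy model of an asynchronous sender. This is a standard-library sketch, not the C++ binding: `AsyncSender` and `demo` are hypothetical names. In C the buffer would be freed memory, which is undefined behaviour; Python keeps the memory alive, so the model only shows stale content.

```python
class AsyncSender:
    """Queues frames now and 'transmits' them later, like a background I/O thread."""

    def __init__(self):
        self._pending = []
        self.wire = []                      # what actually went out

    def send_nocopy(self, buffer):
        # Keeps a reference to the caller's memory; no bytes are copied.
        self._pending.append(memoryview(buffer))

    def send_copy(self, buffer):
        # Takes a private snapshot, the role memcpy plays in the C example.
        self._pending.append(bytes(buffer))

    def flush(self):
        for frame in self._pending:
            self.wire.append(bytes(frame))
        self._pending.clear()

def demo():
    scratch = bytearray(b"Hello")
    sender = AsyncSender()
    sender.send_nocopy(scratch)
    sender.send_copy(scratch)
    for i in range(len(scratch)):           # caller reuses its buffer too early
        scratch[i] = 0
    sender.flush()
    return sender.wire
```

`demo()` returns a first frame of five zero bytes and a second frame of `b"Hello"`: the length survives in both cases, but only the copied frame still carries the data.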
[14:54] private_meta
|
hmm
|
[14:54] pieterh
|
You should be able to send from the zmsg frames, directly
|
[14:54] private_meta
|
What do you mean by that?
|
[14:55] pieterh
|
where is the C++ API documented?
|
[14:55] pieterh
|
I've never used it
|
[14:55] private_meta
|
Good question... I honestly don't know
|
[14:55] mikko
|
zmq.hpp ?
|
[14:55] pieterh
|
ok, checking
|
[14:55] pieterh
|
http://api.zeromq.org/2-1-1:zmq-cpp
|
[14:56] private_meta
|
mikko: there are hardly any comments in there ;)
|
[14:56] pieterh
|
god, that documentation is terrible
|
[14:56] guido_g
|
private_meta: memcpy(msg->data(), s.c_str(), ms);
|
[14:57] pieterh
|
it says rebuild is "equivalent to calling the zmq_msg_close() function followed by the corresponding zmq_msg_init() function."
|
[14:57] guido_g
|
examples are there -> https://github.com/guidog/cpp/tree/master/zmqcpp
|
[14:57] pieterh
|
but there's a world of difference between the different zmq_msg_init() functions
|
[14:59] pieterh
|
guido_g: seems you already made a multipart message class...
|
[14:59] private_meta
|
I'll test if memcpy does the trick... if it does, that's terrifying
|
[14:59] pieterh
|
private_meta: I'm sure there's a proper way to do it
|
[14:59] guido_g
|
pieterh: yes, long time ago
|
[14:59] pieterh
|
but you need to know more about the C++ API
|
[14:59] private_meta
|
pieterh: it doesn't look like it
|
[15:00] pieterh
|
anyhow, that's your problem here, referring to memory that's deallocated by the time the send happens
|
[15:00] pieterh
|
check the implementation of rebuild
|
[15:00] private_meta
|
No, that's not what I meant
|
[15:00] private_meta
|
I meant, if memcpy is the only way to do that, that's terrible
|
[15:01] pieterh
|
you mean 'memcpy' the function, or 'copying memory in general'?
|
[15:01] private_meta
|
the function in general
|
[15:02] pieterh
|
you can _always_ make it more efficient, the question is whether you need to for the examples
|
[15:02] private_meta
|
I'm not using classes and proper design to fiddle around with friggin memcpy (sorry)
|
[15:02] pieterh
|
ah, the function
|
[15:03] private_meta
|
rebuild uses "zmq_msg_init_data"
|
[15:03] pieterh
|
yes, and it doesn't copy anything
|
[15:04] pieterh
|
are you destroying messages on send, as the C code did?
|
[15:06] private_meta
|
More or less
|
[15:07] private_meta
|
Not in the same way you did. I'm using basic_string as objects, and these objects terminate after they are cleared from the vector
|
[15:07] pieterh
|
like guido_g says, add the necessary memcpy after rebuild
|
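A minimal sketch of why the copy matters, written in Python rather than against the real zmq.hpp API. The classes `RefMessage` and `CopyMessage` are hypothetical stand-ins: the first only remembers the caller's buffer (roughly what a rebuild on top of zmq_msg_init_data does), the second copies at build time (the memcpy approach). In C++ the stale reference would be undefined behaviour; here it merely shows up as the wrong payload.

```python
class RefMessage:
    """Hypothetical message that keeps a reference to the caller's buffer."""

    def __init__(self, buf: bytearray) -> None:
        self._buf = buf  # no copy: shares storage with the caller

    def payload(self) -> bytes:
        # Read only when the "send" actually happens.
        return bytes(self._buf)


class CopyMessage:
    """Hypothetical message that copies at build time (the memcpy approach)."""

    def __init__(self, buf: bytearray) -> None:
        self._data = bytes(buf)  # private copy, independent of the caller

    def payload(self) -> bytes:
        return self._data


buf = bytearray(b"hello")
by_ref = RefMessage(buf)
by_copy = CopyMessage(buf)

# The caller reuses its buffer before the queued message goes out.
buf[:] = b"XXXXX"

sent_by_ref = by_ref.payload()    # sees the overwritten buffer
sent_by_copy = by_copy.payload()  # still the original bytes
```

The copying variant costs one extra pass over the data, which is usually acceptable for example code.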
[15:08] pieterh
|
you can make it cleaner afterwards, presumably using zero copy nicely
|
[15:08] guido_g
|
i didn't care about zero-copy, was too much of a hassle
|
[15:09] guido_g
|
i doubt that we have lots of people here that *really* need it
|
[15:09] private_meta
|
It's rather sad that it now works and I have to use memcpy to make it work
|
[15:09] pieterh
|
private_meta: it doesn't work, it's trying to send data it doesn't own
|
[15:10] private_meta
|
pieterh: not sure if I need it, but what exactly is zero copy and how does it work? I may have a general idea, but I've never heard it before
|
[15:10] pieterh
|
well, you provide 0MQ with the destructor / free function
|
[15:10] pieterh
|
then it'll call it when it's sent the message
|
[15:11] pieterh
|
it avoids the memcpy, basically
|
[15:11] pieterh
|
relevant for sending messages only, not receiving them
|
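The hand-off pieterh describes can be modelled as follows. This is an illustrative Python sketch, not the 0MQ API: `ZeroCopyMessage`, `fake_send` and the `wire` list are made-up names. In the C API the real counterpart is zmq_msg_init_data, which takes the buffer together with a free function that the library calls once it no longer needs the data.

```python
from typing import Callable, List


class ZeroCopyMessage:
    """Hypothetical message that borrows a buffer plus a free callback."""

    def __init__(self, buf: bytearray,
                 free_fn: Callable[[bytearray], None]) -> None:
        self._buf = buf          # borrowed, not copied
        self._free_fn = free_fn  # invoked when the library is done
        self._released = False

    def payload(self) -> bytes:
        return bytes(self._buf)

    def release(self) -> None:
        # Hand the buffer back exactly once.
        if not self._released:
            self._released = True
            self._free_fn(self._buf)


def fake_send(msg: ZeroCopyMessage, wire: List[bytes]) -> None:
    """Stand-in for the I/O thread: transmit, then release the buffer."""
    wire.append(msg.payload())
    msg.release()


freed: List[bytearray] = []
wire: List[bytes] = []
buf = bytearray(b"zero-copy payload")

fake_send(ZeroCopyMessage(buf, freed.append), wire)
```

Until the callback fires, the sender must treat the buffer as owned by the library and leave it untouched.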
[15:11] private_meta
|
kk
|
[15:11] private_meta
|
ok, seems to work now
|
[15:12] private_meta
|
want it in the mailing list as well?
|
[15:12] private_meta
|
or maybe guido wants to look over it
|
[15:13] guido_g
|
hm?
|
[15:14] guido_g
|
private_meta: look over what?
|
[15:15] private_meta
|
the c++ version of zmsg.h, thought you might wanna take a look at it, as you seem to be ok with c++
|
[15:17] guido_g
|
which zmsg.h?
|
[15:17] pieterh
|
:-) guido_g: private_meta has been translating examples from the Guide
|
[15:17] guido_g
|
ahhh
|
[15:17] pieterh
|
zmsg is the multipart message class I made for the C examples
|
[15:18] guido_g
|
ic
|
[15:18] pieterh
|
private_meta: to be honest, the code looks nice, I assume you'll call it .cpp not .h...
|
[15:18] guido_g
|
or .hpp if it's to be included
|
[15:19] pieterh
|
the advantage of making the class with the same methods is that the more complex C examples will be easy to turn into C++
|
[15:20] guido_g
|
ouch
|
[15:20] guido_g
|
should change the licence of my python-mdp stuff
|
[15:20] guido_g
|
agpl might be a little too much
|
[15:23] private_meta
|
pieterh: want the updated version via paste on a website or via mailing list?
|
[15:23] private_meta
|
pieterh: The advantage of translating that stuff to the language I need before trying myself is so I get to know errors like the memcpy one, so I'm able to avoid them
|
[15:24] pieterh
|
private_meta: mailing list is best, if you translate a bunch of examples, a pull request is also good
|
[15:24] private_meta
|
pull request?
|
[15:24] pieterh
|
guido_g: yeah, saw that, did not comment :-)
|
[15:24] pieterh
|
private_meta: github pull request
|
[15:25] private_meta
|
Oh, I never used GIT
|
[15:25] guido_g
|
pieterh: suggestions?
|
[15:25] pieterh
|
it's worth learning
|
[15:25] pieterh
|
guido_g: just GPLv3 in this case
|
[15:25] private_meta
|
somehow I don't feel like going through the hassle of installing and configuring it just to do something like a pull request
|
[15:25] guido_g
|
pieterh: oki
|
[15:25] pieterh
|
private_meta: it lets you do things like grab the zguide repository, see versions, changes, etc.
|
[15:26] pieterh
|
in general a good version control system is worth gold, and git is one of the better ones
|
[15:26] private_meta
|
We're using SVN here
|
[15:26] pieterh
|
well, git is to svn like svn is to cvs
|
[15:27] private_meta
|
Can't say, never used it, although the client configuration somehow feels way more complicated than svn
|
[15:28] private_meta
|
and to be honest, download from github works, so i haven't felt the need yet
|
[15:28] pieterh
|
like I say, it's worth learning
|
[15:28] private_meta
|
In case I need it for more than this, I'll consider it :)
|
[15:28] pieterh
|
anyhow, send the class to the list, sure
|
[15:29] private_meta
|
Don't misunderstand me... It's just a matter of experience that when I start looking into GIT now, it will take me an endless amount of time because of the problems I'll face before I can properly use it
|
[15:30] private_meta
|
So my threshold to start using it is higher than normal I guess
|
[15:31] cremes
|
i need a little confirmation on how zmq_msg_t's are handled with inproc transport
|
[15:31] guido_g
|
true, git is not for the faint-hearted
|
[15:31] cremes
|
the sender needs to call close on the message
|
[15:31] cremes
|
and the receiver needs to close it too after it's done with the data buffer
|
[15:31] cremes
|
true?
|
[15:32] private_meta
|
guido_g: Well, I somewhat compare it to people telling you "Well, if you need that and that, just start using Linux... Linux is so easy if you're proficient" but they dare not say that REAL proficiency takes up to 5-10 years with Linux if you're unlucky
|
[15:32] guido_g
|
and infinite time for windows...
|
[15:33] private_meta
|
Don't exaggerate ;)
|
[15:34] private_meta
|
I mean, I'm using Linux, of course, but for those kinds of problems, using the Linux analogy again, it's questionable to learn something for 5 years to do something in a minute which would otherwise take 5
|
[15:35] pieterh
|
cremes: is there a special case for inproc...?
|
[15:35] cremes
|
right, that's the question
|
[15:35] pieterh
|
afaik 0MQ copies the data
|
[15:35] cremes
|
or do i handle those messages exactly the same as i would for other transport types?
|
[15:35] pieterh
|
exactly the same
|
[15:35] cremes
|
perfect
|
[15:35] pieterh
|
unless there is a line in the man pages saying otherwise
|
[15:36] cremes
|
nope, but since it just flips pointers around behind the scenes i wondered if there was a special case
|
[15:36] cremes
|
thanks for the answer
|
[15:36] private_meta
|
pieterh: hmm... Now I can finally take a look at majordomo ;)
|
[15:36] pieterh
|
you'd have to ask Sustrik but I'm pretty sure the received message data is a copy
|
[15:36] pieterh
|
private_meta: :-)
|
[15:36] cremes
|
pieterh: i believe you :)
|
[15:38] private_meta
|
pieterh: 11 files prefixed with md are the basic majordomo files, am I missing anything?
|
[15:39] pieterh
|
private_meta: that's it
|
[15:39] pieterh
|
you should IMO start at the top of Ch4
|
[15:39] pieterh
|
because it's easier to understand the MD pattern if you've seen the evolution of it
|
[15:40] private_meta
|
kk
|
[15:41] private_meta
|
You said that for my problem, 1 worker (management service) and n clients, I should integrate the broker into the worker?
|
[15:41] private_meta
|
Do I remember that correctly?
|
[15:41] pieterh
|
well, it's one way
|
[15:41] pieterh
|
I've not documented it yet, that's after titanic
|
[15:41] private_meta
|
-should+can
|
[15:41] private_meta
|
titanic?
|
[15:41] pieterh
|
:-) unsinkable etc. etc.
|
[15:42] private_meta
|
ah, i need to translate zhash and zlist as well it seems
|
[15:42] pieterh
|
nope, in C++ you'll find lists and hashes IMO
|
[15:42] pieterh
|
STL, right?
|
[15:43] private_meta
|
ah ok
|
[15:43] private_meta
|
so you just implemented a normal hash and list class
|
[15:43] pieterh
|
yes
|
[15:43] private_meta
|
ok, good
|
[15:43] private_meta
|
Good to know
|
[15:46] Guthur
|
I'm looking forward to implementing majordomo
|
[15:46] private_meta
|
Ok, let's dive deep into Chapter 4 then
|
[15:46] Guthur
|
need to do Ch4
|
[15:48] guido_g
|
pieterh: guide ch4: doesn't ømq deactivate nagle?
|
[15:48] pieterh
|
guido_g: I'm not sure why it would
|
[15:48] pieterh
|
let me check the source
|
[15:48] guido_g
|
pieterh: would mean at least a 20ms delay for small messages
|
[15:49] pieterh
|
yup, indeed it does
|
[15:49] guido_g
|
ahhh i shouldn't read that now
|
[15:49] pieterh
|
:-) I'll fix that, thanks
|
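For readers unfamiliar with the option: on a plain TCP socket, Nagle's algorithm is switched off with TCP_NODELAY. The sketch below uses only Python's standard `socket` module and does not touch 0MQ; the helper name `make_low_latency_socket` is made up for illustration, and it is an assumption that 0MQ's TCP transport does the equivalent internally.

```python
import socket


def make_low_latency_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY = 1 sends small writes immediately instead of
    # waiting to coalesce them with later data.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_low_latency_socket()
# Read the option back; a non-zero value means Nagle is off.
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
sock.close()
```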
[15:49] private_meta
|
What's Nagle?
|
[15:49] guido_g
|
*sigh*
|
[15:49] pieterh
|
google is your buddy :-)
|
[15:50] guido_g
|
http://en.wikipedia.org/wiki/Nagle's_algorithm <- copied from the guide, ch4
|
[15:50] guido_g
|
private_meta: remember, the guide! it knows *everything*!
|
[15:50] private_meta
|
ah
|
[15:51] private_meta
|
The answer to life, the universe, and everything?
|
[15:51] guido_g
|
yes
|
[15:51] private_meta
|
nice
|
[15:51] private_meta
|
Google Calculator knows that too
|
[15:52] guido_g
|
who's google?
|
[15:52] private_meta
|
Ask the Guide, it should know
|
[15:52] guido_g
|
it's not in the guide, so it's not existing
|
[15:52] Guthur
|
it's some small startup company
|
[15:52] Guthur
|
doing webby stuff
|
[15:53] Guthur
|
it'll never last...
|
[15:53] guido_g
|
ahh this web crap doesn't have a future
|
[15:54] private_meta
|
Yeah, that'll never last... what would be next? Social Networks? Preposterous
|
[15:55] private_meta
|
pieterh: Just out of curiosity, what does it take for your rfc drafts to become stable?
|
[15:55] pieterh
|
private_meta: that's explained in the sidebar, I think
|
[15:55] guido_g
|
as if ordinary people will spend hours in front of a computer... unthinkable
|
[15:55] Guthur
|
when he stops making changes
|
[15:55] private_meta
|
Ah
|
[15:55] guido_g
|
hrhrhr
|
[15:56] private_meta
|
not seeing the forest for the trees
|
[16:08] pieterh
|
private_meta: the state of a specification is a contract with its users, basically
|
[16:08] pieterh
|
the whole thing is explained here: http://www.digistan.org/spec:1/COSS
|
[16:11] private_meta
|
so, if I use Majordomo somewhere, it would become stable. How far, in your COS System, could a protocol be modified to be still considered as "deployed"?
|
[16:11] Guthur
|
it would be nice to have a few language bindings interop'ing using the majordomo protocol
|
[16:11] pieterh
|
private_meta: once a protocol is deployed to real users it can't be modified any more
|
[16:11] pieterh
|
instead you fork it, make a new version
|
[16:12] private_meta
|
So if I deviate from the protocol slightly, it doesn't count as deployed?
|
[16:12] private_meta
|
Guthur: I'll make a C++ binding for Majordomo
|
[16:12] pieterh
|
deviate? then you're implementing something else
|
[16:12] private_meta
|
kk
|
[16:13] pieterh
|
which of course anyone can do, but the point is you as implementer need assurance the spec won't change randomly
|
[16:14] private_meta
|
Yeah, so the protocol can be referred to, makes oversight easier I guess?
|
[16:14] pieterh
|
what do you mean by 'oversight'?
|
[16:15] private_meta
|
Ah well
|
[16:15] private_meta
|
overlook, for problem solving, I think you mentioned something like that... "So, you need help with that? You kept to that protocol? Good, that's a starting point"
|
[16:16] pieterh
|
yes, indeed
|
[16:16] pieterh
|
the core reason for a formal spec is that multiple teams can work on pieces of the problem
|
[16:16] pieterh
|
so the spec is their contract
|
[16:16] private_meta
|
k
|
[16:16] guido_g
|
works quite well
|
[16:17] pieterh
|
this is why guido_g and I have been arguing over words, if they're unclear, the contract is unclear
|
[16:17] guido_g
|
yeah
|
[16:17] private_meta
|
So if something is unclear in the Protocol it should be mentioned?
|
[16:17] private_meta
|
(sounds obvious)
|
[16:17] pieterh
|
yes, as you implement it, you'll find areas that are unclear
|
[16:18] pieterh
|
assumptions the author made, or things he knows but other people don't
|
[16:18] private_meta
|
Yeah, I assume there might be stuff in there you both are clear about, but others like me may not be
|
[16:18] pieterh
|
especially if the author is also writing a reference implementation
|
[16:19] pieterh
|
in the case of Majordomo, we assume working knowledge of the Guide
|
[16:20] private_meta
|
sure
|
[16:51] Guthur
|
pieterh: is PPP completely superseded by MDP
|
[16:52] pieterh
|
yes
|
[16:52] Guthur
|
PPP was a little short lived
|
[16:53] pieterh
|
Poor PPP... it didn't get much market traction
|
[16:55] stimpie
|
I have installed zmq with --prefix=$HOME/bin/ and I am trying to build jzmq, but it keeps failing with "configure: error: cannot find zmq.h" even with --includedir pointing to the location
|
[17:18] guido_g
|
ahhh more guide stuff...
|
[17:41] pieterh
|
guido_g: cyl, I'm off for a while...
|
[17:41] guido_g
|
have fun
|
[17:41] guido_g
|
/me is implementing mmi
|
[17:41] pieterh
|
:-)
|
[20:41] cremes
|
has anyone seen a "double free" error from zmq_msg_close() before? https://gist.github.com/859170
|
[20:41] cremes
|
it's possible/probable that my code is calling zmq_msg_close() twice and i just haven't found it yet, but
|
[20:41] mikko
|
double freeing message?
|
[20:41] cremes
|
i thought i would ask in case this error popped up under other conditions
|
[20:42] cremes
|
mikko: that's what it says
|
[20:42] cremes
|
the reason i am curious though is because this is one worker of 3
|
[20:42] cremes
|
they are all running the same code off of the same box
|
[20:42] cremes
|
but the same one craps out with this error after about an hour of running
|
[20:42] cremes
|
so, i'm puzzled
|
[20:42] mikko
|
is the message data copied?
|
[20:43] cremes
|
yes
|
[20:43] mikko
|
very hard to say what it could be without seeing larger context
|
[20:43] mikko
|
i guess most likely that zmq_msg_data has been freed before calling zmq_msg_close
|
[20:44] andrewvc
|
cremes, could it be some object finalizer bug?
|
[20:44] cremes
|
andrewvc_: improbable; it happens under jruby and mri 1.9.2 which handle finalizers differently
|
[20:45] cremes
|
i was kind of hoping someone would have seen something similar and knew what mistake i had made
|
[20:45] andrewvc
|
cremes: ah, well nm then
|
[20:46] andrewvc
|
btw, what are your thoughts on finalizers? Evan's not a fan?
|
[20:46] cremes
|
whenever i see a malloc or free bug, i'm reminded that it could just be heap corruption
|
[20:46] cremes
|
and malloc/free are the first to stumble across it and bitch
|
[20:46] cremes
|
andrewvc: i think they are necessary
|
[20:47] andrewvc
|
better than the alternative of manual freeing then
|
[20:47] cremes
|
particularly when you need to interface managed code (ruby/python/etc) with unmanaged memory (c/c++/etc)
|
[20:47] cremes
|
ok, well, i'm off to audit my worker code *again* to see where this could be getting triggered
|
[20:47] andrewvc
|
cremes: good luck :)
|
[20:48] andrewvc
|
let me know if there's a passage that could use a second set of eye
|
[20:48] andrewvc
|
*eyes
|
[20:48] cremes
|
andrewvc: thank you, will do
|
[20:48] mikko
|
cremes: run inside valgrind
|
[20:48] mikko
|
and see if you can reproduce
|
[20:48] mikko
|
would probably give a lot more insight
|
[20:49] cremes
|
mikko: don't i have to recompile 0mq with some make-valgrind-happy flag?
|
[20:49] mikko
|
cremes: nope
|
[20:49] cremes
|
oh, good
|
[20:49] mikko
|
there are just some 'false' alerts
|
[20:49] mikko
|
'x points to uninitialized bytes'
|
[21:08] cremes
|
andrewvc: ping
|
[21:08] andrewvc
|
cremes: yo, on a call, i'll be around in a bit
|
[21:08] cremes
|
ok, ping me when you are... i need to pick your brain
|
[22:11] andrewvc
|
cremes: what's up?
|
[22:13] cremes
|
so, i think i found the problem; it requires a change to an api 'contract' in ffi-rzmq
|
[22:14] cremes
|
right now when you call #send it handles calling Message#close for you
|
[22:14] cremes
|
however, this can be a problem if the #send fails
|
[22:14] andrewvc
|
oh, interesting
|
[22:15] cremes
|
zmqmachine has a #send_messages method that automatically sets the right flags for multi-part messages, etc
|
[22:15] cremes
|
and it uses this lower level method from ffi-rzmq to handle the sending
|
[22:15] andrewvc
|
yeah, one of the nice features in it
|
[22:15] andrewvc
|
just Socket#send right
|
[22:15] cremes
|
right
|
[22:15] andrewvc
|
I see an ensure there
|
[22:16] cremes
|
so as i built even more abstractions on top of these functions, i painted myself into a corner
|
[22:16] andrewvc
|
ensure message.close
|
[22:16] cremes
|
yes, i think that's the "bug"
|
[22:16] andrewvc
|
oh?
|
[22:16] cremes
|
let me get you a gist...
|
[22:16] andrewvc
|
cool
|
[22:17] cremes
|
take a look at the logic for #write: https://gist.github.com/859387
|
[22:17] cremes
|
if the call to send_messages returns a failure, it keeps the messages on the internal queue and
|
[22:18] cremes
|
tries to resend them in 10ms
|
[22:18] cremes
|
however, the only way this would fail is if at least one part of a multi-part message failed
|
[22:18] cremes
|
but any parts *before* that one already had Message#close called on them
|
[22:19] cremes
|
so when i try to resend and then close them, kaboom, double free
|
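The failure mode cremes describes can be reproduced with a small model. This is a Python sketch, not ffi-rzmq or zmqmachine code: `Message`, `FlakySocket`, `send_messages` and `DoubleClose` are hypothetical names, and the always-close behaviour mirrors the `ensure message.close` that andrewvc spotted. A real double free would corrupt the heap; the model raises an exception instead.

```python
class DoubleClose(Exception):
    """Stands in for the double free reported by the allocator."""


class Message:
    def __init__(self, body: bytes) -> None:
        self.body = body
        self.closed = False

    def close(self) -> None:
        if self.closed:
            raise DoubleClose(self.body)
        self.closed = True


class FlakySocket:
    """Old contract: send() always closes the message, even on failure."""

    def __init__(self, fail_on: int) -> None:
        self.fail_on = fail_on  # 1-based index of the call that fails
        self.calls = 0

    def send(self, msg: Message) -> bool:
        try:
            self.calls += 1
            return self.calls != self.fail_on
        finally:
            msg.close()  # equivalent of `ensure message.close`


def send_messages(sock: FlakySocket, parts) -> bool:
    """Send every part; stop and report failure at the first bad part."""
    for part in parts:
        if not sock.send(part):
            return False
    return True


parts = [Message(b"address"), Message(b"body")]
sock = FlakySocket(fail_on=2)

first_attempt = send_messages(sock, parts)  # second part fails

try:
    # Retry with the same Message objects, as the write queue would.
    send_messages(sock, parts)
    retry_outcome = "ok"
except DoubleClose:
    retry_outcome = "double close"
```

The first part is already closed when the retry reaches it, so the second close is the one that blows up.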
[22:19] andrewvc
|
ohhh
|
[22:19] andrewvc
|
how does one part of a multipart fail?
|
[22:19] andrewvc
|
HWM?
|
[22:19] cremes
|
i don't know
|
[22:19] cremes
|
it's my only theory and the backtraces kind of led me to here
|
[22:20] andrewvc
|
hmmm, I mean, do you have a HWM set?
|
[22:20] cremes
|
regardless, changing the contract for Socket#send moves the api a little closer back to the C version
|
[22:20] andrewvc
|
so, no more 'ensure'
|
[22:20] andrewvc
|
do users need to manually message.close then?
|
[22:21] cremes
|
correct
|
[22:21] cremes
|
on both counts
|
[22:22] andrewvc
|
That sounds OK to me. However, it'll break all current code
|
[22:22] andrewvc
|
any ideas for compat?
|
[22:22] cremes
|
right, that's why i wanted to bring this up with you
|
[22:22] cremes
|
you're the only one (other than me) who has projects that depend on this
|
[22:22] andrewvc
|
hehe, well, others are picking it up, I think they're just quiet
|
[22:22] andrewvc
|
I think there's a silent minority, or not
|
[22:22] cremes
|
so, it's a backward-incompatible change
|
[22:23] andrewvc
|
yeah, it's no big deal to me, I just rely on em-zeromq these days
|
[22:23] cremes
|
it *is* a pre-1.0 release, so this kind of thing can happen
|
[22:23] andrewvc
|
so, it's like 1 line
|
[22:23] cremes
|
good to hear
|
[22:23] cremes
|
does dripdrop need changes too or no?
|
[22:23] andrewvc
|
shouldn't
|
[22:24] andrewvc
|
well, yeah, my bad dripdrop needs one
|
[22:24] andrewvc
|
em-zeromq actually doesn't
|
[22:24] andrewvc
|
since you end up writing straight to the socket there, it's not its responsibility
|
[22:24] andrewvc
|
hmmm, here's an idea
|
[22:24] cremes
|
i've documented the change and breakage in the History.txt / changelog but i don't know if folks really read those
|
[22:24] andrewvc
|
maybe we could deprecate it somehow
|
[22:25] andrewvc
|
one thing I've never liked is the collision between #send and Object#send
|
[22:25] andrewvc
|
I know it's in the ZMQ guidelines, but in ruby, 'send' is special
|
[22:25] andrewvc
|
it might be a good way to phase that out at the same time
|
[22:25] andrewvc
|
maybe deprecate #send, and implement a #send_message
|
[22:25] cremes
|
what's an alternative name that you think works?
|
[22:25] andrewvc
|
or send_msg for brevity
|
[22:26] cremes
|
then i'm going to want to change #recv so that they are symmetrical :)
|
[22:26] andrewvc
|
ohh, damn, yeah, good point
|
[22:27] andrewvc
|
I like short names, recv_msg is better than rcv_message
|
[22:27] andrewvc
|
in terms of consistency
|
[22:27] cremes
|
well, i'm going to pull the trigger on this right now
|
[22:28] cremes
|
i'll do send_msg and recv_msg
|
[22:28] andrewvc
|
awesome, yeah, I was bit trying to dynamically use Object#send on a socket ages ago, this'll be awesome. Usually I just deprecate by throwing a STDERR.write("Socket#send is deprecated, use Socket#send_msg")
|
[22:28] andrewvc
|
in there
|
[22:28] cremes
|
you could always use _send_
|
[22:28] andrewvc
|
dunno if that might break people's apps terribly, but on the other hand it *would* annoy them terribly, to great effect
|
[22:28] cremes
|
i think that's an alias
|
[22:28] andrewvc
|
I never knew about that, is that on all objects?
|
[22:28] cremes
|
yes
|
[22:29] andrewvc
|
not on my ruby
|
[22:29] andrewvc
|
oh, in ffi-rzmq?
|
[22:29] cremes
|
no, it's a ruby thing
|
[22:29] andrewvc
|
ruby-1.9.2-p180 :008 > Object._send_
|
[22:29] andrewvc
|
does not work
|
[22:30] andrewvc
|
undefined method
|
[22:30] cremes
|
oh, it's __send__ with two underscores
|
[22:30] andrewvc
|
ohhhhh
|
[22:30] cremes
|
i had to look that up
|
[22:30] andrewvc
|
interesting, maybe I should use that elsewhere
|
[22:30] andrewvc
|
I'd argue that's a bit of a deficiency in ruby, shoulda been named send_method or something
|
[22:31] andrewvc
|
because what programmer would ever want a method named 'send' in their library
|
[22:31] cremes
|
heh
|
[22:32] andrewvc
|
we have define_method, but no send_method
|
[22:32] andrewvc
|
anyway... so, basically now, you send a message, and must close each one you send
|
[22:32] cremes
|
correct
|
[22:32] cremes
|
just like the c api
|
[22:32] andrewvc
|
how does this affect send_string
|
[22:32] andrewvc
|
kinda makes that a bit odd eh?
|
[22:32] cremes
|
that will get a line added to it to release the message
|
[22:33] cremes
|
it's never exposed to the end user anyway so they don't even know a Message object was created there
|
[22:33] cremes
|
its behavior will be unchanged
|
[22:33] andrewvc
|
yeah, so, if I send a multi-part message based out of strings
|
[22:34] andrewvc
|
send_string will still have this bug no?
|
[22:34] cremes
|
no
|
[22:34] cremes
|
it doesn't try to resend the *same* Message object
|
[22:34] cremes
|
it allocates a new one for every #send_string call
|
[22:34] andrewvc
|
oh, gotcha
|
[22:34] cremes
|
y'know, i'm bothered by this deprecation thing
|
[22:35] andrewvc
|
yeah, it isn't nice, maybe we should think about it a bit
|
[22:35] cremes
|
the way it works now is just wrong
|
[22:35] cremes
|
maybe i could create a #fire_and_forget that behaves like the original #send
|
[22:35] cremes
|
it would manage the message close for you
|
[22:36] cremes
|
so the functionality could still be in the library but it wouldn't be the default
|
[22:36] andrewvc
|
hmmm
|
[22:36] cremes
|
#send_and_close(message) ??
|
[22:36] andrewvc
|
much better
|
[22:37] cremes
|
i like that better too
|
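One possible shape for the API discussed above, sketched in Python rather than Ruby. The method names `send_msg` and `send_and_close` come from the conversation; everything else (the `Socket` and `Message` classes, keeping `send` as a deprecated alias that warns) is an assumption for illustration and may not match what ended up in ffi-rzmq.

```python
import warnings


class Message:
    def __init__(self, body: bytes) -> None:
        self.body = body
        self.closed = False

    def close(self) -> None:
        self.closed = True


class Socket:
    def __init__(self) -> None:
        self.sent = []

    def send_msg(self, msg: Message) -> bool:
        """New contract: never closes; the caller owns the message."""
        self.sent.append(msg.body)
        return True

    def send_and_close(self, msg: Message) -> bool:
        """Opt-in convenience: send, then always close the message."""
        try:
            return self.send_msg(msg)
        finally:
            msg.close()

    def send(self, msg: Message) -> bool:
        """Deprecated alias that keeps the old close-on-send behaviour."""
        warnings.warn("Socket.send is deprecated, use Socket.send_msg",
                      DeprecationWarning, stacklevel=2)
        return self.send_and_close(msg)


sock = Socket()

kept = Message(b"retry me")
sock.send_msg(kept)        # still open, safe to resend later

gone = Message(b"fire and forget")
sock.send_and_close(gone)  # closed for the caller

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = Message(b"legacy")
    sock.send(legacy)      # works, but warns
```

Keeping the close out of the default path means a failed multipart send can be retried with the same message objects.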
[22:41] cremes
|
andrewvc: done... https://gist.github.com/859387
|
[22:51] andrewvc
|
lookin good
|
[23:41] cremes
|
what are the conditions that raise this assertion?
|
[23:41] cremes
|
Assertion failed: !more || pipes [current] != pipe_ (fq.cpp:62)
|
[23:44] cremes
|
i am seeing it when writing a 2-part message to a pub socket using an inproc transport
|
[23:45] cremes
|
it's not super high-volume either... maybe 1000 msgs/sec
|
[23:45] cremes
|
size of the first message part is under 10 bytes and the second part is under 200 bytes
|
[23:46] cremes
|
i'll try to put together some C code to reproduce
|
[23:46] cremes
|
it
|