[Time] Name | Message |
[05:39] CIA-20
|
zeromq2: Martin Sustrik maint * rf61921d / src/req.cpp : REQ socket can die when reply is delivered on wrong underlying connection -- fixed - http://bit.ly/cX2rpW
|
[05:46] CIA-20
|
zeromq2: Dhammika Pathirana maint * rc1deb22 / src/ypipe.hpp : crash when closing an ypipe -- fixed - http://bit.ly/bmxM8U
|
[05:53] CIA-20
|
zeromq2: Martin Sustrik master * rf61921d / src/req.cpp : REQ socket can die when reply is delivered on wrong underlying connection -- fixed - http://bit.ly/cX2rpW
|
[05:53] CIA-20
|
zeromq2: Dhammika Pathirana master * rc1deb22 / src/ypipe.hpp : crash when closing an ypipe -- fixed - http://bit.ly/bmxM8U
|
[05:53] CIA-20
|
zeromq2: Martin Sustrik master * r6715f9b / src/ypipe.hpp :
|
[05:53] CIA-20
|
zeromq2: Merge branch 'maint'
|
[05:53] CIA-20
|
zeromq2: * maint:
|
[05:53] CIA-20
|
zeromq2: crash when closing an ypipe -- fixed - http://bit.ly/cxlL0o
|
[09:20] keffo
|
pieterh, around?
|
[09:21] pieterh
|
keffo: hi!
|
[09:23] keffo
|
Busy?
|
[09:24] pieterh
|
always, but shoot...
|
[09:24] keffo
|
I'm in the process of solidifying how the network is monitored.. Currently I gather various info & statistics in the loadbalancer which publishes regularly(~5s), and a WPF app subscribing and displaying nice graphs etc
|
[09:25] pieterh
|
sounds good
|
[09:25] keffo
|
but I can't really figure out a good way of limiting it.. I don't want to send a complete state every 5s
|
[09:25] pieterh
|
how large is the state?
|
[09:26] keffo
|
Depends, both on what info I decide to publish.. I'd like load, bandwidth usage, connected nodes and their respective stats (cpu/ram/etc), but also an overview of what is happening, as well as more detailed info for each worker
|
[09:27] pieterh
|
estimated size? in bytes?
|
[09:27] keffo
|
geesh, no idea, not in the mb range at least :)
|
[09:27] pieterh
|
if you have no idea, it's not sensible to think about optimizing it
|
[09:27] pieterh
|
so do a back-of-envelope calculation and come up with a figure...
|
[09:28] keffo
|
I wondered if it was sound to have the monitoring app poll a complete current-state at startup, and then depend on a persistent sub-forwarder to handle deltastates? But that sounds very complicated
|
[09:28] keffo
|
I can't really guesstimate, the number of nodes can range from the local setup I have here of a few machines, to much larger..
|
[09:28] pieterh
|
...
|
[09:29] pieterh
|
how big is "much larger"?
|
[09:29] keffo
|
ideally wan :)
|
[09:29] pieterh
|
please stick a number onto it...
|
[09:29] keffo
|
but being realistic, perhaps around 20?
|
[09:29] pieterh
|
and how large would the state be per worker?
|
[09:30] pieterh
|
please stick a number onto it...
|
[09:30] pieterh
|
then multiply the two numbers and add something for the overview
|
[09:30] keffo
|
Basic info(linpack measurements), uptime, average load, around that
|
[09:30] pieterh
|
come back when you have a total in KB per 5 seconds, a'ight?
|
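The back-of-envelope calculation pieterh asks for (nodes times per-worker state, plus something for the overview) could look like the sketch below. Only the figure of 20 nodes comes from the conversation; the per-worker and overview sizes are made-up placeholders, since keffo never gives them.

```python
def state_kb_per_interval(nodes, bytes_per_worker, overview_bytes):
    """Total published state per interval, in KB (1 KB = 1024 bytes)."""
    return (nodes * bytes_per_worker + overview_bytes) / 1024


# 20 nodes is keffo's estimate; 200 bytes per worker and a 1024-byte
# overview are hypothetical placeholders, not numbers from the discussion.
total_kb = state_kb_per_interval(nodes=20, bytes_per_worker=200, overview_bytes=1024)
print(f"{total_kb:.1f} KB every 5 seconds")
```

With placeholder numbers of this order the full state is a few KB per interval, which is the kind of figure that decides whether optimizing is worth it at all.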
[09:32] keffo
|
That's not what I'm interested in, the assumption here is that no data is published that isn't "needed", but I would like to figure out the most efficient means of passing around that data
|
[09:32] keffo
|
basically how to monitor a distributed system with the least overhead possible..
|
[09:32] pieterh
|
this is for research purposes rather than an actual use case...
|
[09:32] keffo
|
(regardless of what the data actually is)
|
[09:33] keffo
|
well both I guess, research first, use later :)
|
[09:33] pieterh
|
well, you can wait until Ch4 of the Guide if you want to
|
[09:33] keffo
|
Surely this problem has been dealt with before, as loadbalancing has :)
|
[09:33] pieterh
|
but here is how I'd do it...
|
[09:33] pieterh
|
- maintain state in the publisher
|
[09:33] pieterh
|
- apply updates to state and publish updates to pub socket
|
[09:34] pieterh
|
- in subscriber, request state via req/rep socket
|
[09:34] pieterh
|
- and also subscribe to updates
|
[09:34] pieterh
|
- queue incoming updates
|
[09:34] pieterh
|
- as soon as state arrives, apply updates to state and continue to do this
|
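The steps pieterh lists can be sketched without any sockets, as a minimal in-process model. Everything here is an assumption made for illustration: the class and method names are hypothetical, the subscriber list stands in for a PUB socket, `snapshot()` stands in for the REQ/REP state request, and the per-update sequence number is an addition (not mentioned in the discussion) that lets the subscriber skip queued updates the snapshot already contains.

```python
from collections import deque


class Publisher:
    """Holds the authoritative state and numbers every update."""

    def __init__(self):
        self.state = {}
        self.sequence = 0
        self.subscribers = []  # stand-in for a PUB socket

    def update(self, key, value):
        self.sequence += 1
        self.state[key] = value
        for sub in self.subscribers:
            sub.on_update(self.sequence, key, value)

    def snapshot(self):
        # Stand-in for answering a state request over REQ/REP.
        return self.sequence, dict(self.state)


class Subscriber:
    def __init__(self):
        self.state = None  # None until the snapshot arrives
        self.sequence = 0
        self.pending = deque()

    def on_update(self, sequence, key, value):
        if self.state is None:
            self.pending.append((sequence, key, value))  # queue until state arrives
        else:
            self._apply(sequence, key, value)

    def on_snapshot(self, sequence, state):
        self.sequence, self.state = sequence, state
        while self.pending:
            self._apply(*self.pending.popleft())

    def _apply(self, sequence, key, value):
        if sequence > self.sequence:  # skip updates already in the snapshot
            self.state[key] = value
            self.sequence = sequence


pub = Publisher()
pub.update("node1.load", 0.4)

sub = Subscriber()
pub.subscribers.append(sub)    # subscribe first...
pub.update("node2.load", 0.9)  # ...this update is queued
snap = pub.snapshot()          # state requested
pub.update("node1.load", 0.7)  # queued while the reply is "in flight"
sub.on_snapshot(*snap)         # state arrives, queue is drained
pub.update("node3.load", 0.1)  # applied directly from now on

print(sub.state == pub.state)
```

Subscribing before requesting the state is what keeps the subscriber from missing an update that falls between the snapshot and the first delivered delta.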
[09:34] keffo
|
m, that's what I was thinking as well.
|
[09:35] pieterh
|
i think it's robust but need to prove it
|
[09:35] keffo
|
Things do start to get hairy if the monitor app starts to depend on delta-updates though, like "join/part" of nodes
|
[09:35] keffo
|
you're doing this type of stuff for ch 4?
|
[09:35] pieterh
|
yup
|
[09:36] pieterh
|
stateful pubsub
|
[09:36] pieterh
|
or whatever this is properly called...
|
[09:36] keffo
|
I'll try it out, see if it behaves well.. :)
|
[09:37] pieterh
|
feel free to write it up as a recipe or code sample
|
[09:37] pieterh
|
if i can reuse that for the guide it'll save me time
|
[09:37] keffo
|
I'll let you know for sure yeah
|
[09:38] pieterh
|
the main reason for this is not so much to save network bandwidth but to allow realtime updates
|
[09:38] keffo
|
I was also thinking about 'history-nodes'.. something appealing about that, someone who keeps track of what's going on, but isn't directly part -of- the system
|
[09:38] pieterh
|
indeed, this work could be totally outsourced to a stateful device
|
[09:38] keffo
|
non-gonzo network monitoring
|
[09:39] pieterh
|
with some notion of state + patches
|
[09:39] pieterh
|
like pair/value updates
|
[09:39] pieterh
|
hmm, nice
|
[09:39] pieterh
|
it's a distributed cache
|
[09:40] keffo
|
yes, distributed tuple-storage
|
[09:40] pieterh
|
yup, that's the thing
|
[09:40] keffo
|
I can't help but think most source control software faces much of the same issues here
|
[09:40] pieterh
|
in the general solution any node can update its cache
|
[09:41] pieterh
|
it is a very useful general solution to state distribution
|
[09:41] keffo
|
oh yeah, one can think of it as cache hits and misses, makes things clearer
|
[09:42] pieterh
|
i'd start with a simple model, one publisher, many subscribers, pair/value updates
|
[09:43] keffo
|
It reuses the same mechanisms as the rest
|
[09:44] keffo
|
maybe I could contribute some C# stuff to go along with ch 4.. I do find it quite lacking on the site..
|
[09:45] pieterh
|
keffo: we got one C# example yesterday, but others would be great
|
[09:46] pieterh
|
do start at ch1 if you could, it's quite trivial stuff but useful to newcomers
|
[09:46] keffo
|
yeah..
|
[09:46] keffo
|
need a complete binding, for starters :)
|
[09:47] pieterh
|
the binding is not complete?
|
[09:47] keffo
|
no, there was no poll for example?
|
[09:47] keffo
|
or did I somehow get an old version?
|
[09:48] keffo
|
http://github.com/zeromq/clrzmq/blob/master/clrzmq/zmq.cs
|
[09:48] keffo
|
nope..
|
[09:48] pieterh
|
who maintains this binding?
|
[09:49] pieterh
|
ask them to fix it or submit a patch
|
[09:49] keffo
|
not sure, says sustrik last commit I guess
|
[09:49] pieterh
|
hmm, the owner of every project should be WRITTEN IN HUGE LETTERS
|
[09:49] keffo
|
then there's http://nzmq.codeplex.com as well
|
[09:49] pieterh
|
otherwise it's kind of dead by definition
|
[09:49] pieterh
|
nzmq is something layered on top afaics
|
[09:49] pieterh
|
different API
|
[09:50] keffo
|
yes, but it does the same lowlevel binding of the dll
|
[09:50] pieterh
|
a'ight
|
[09:51] keffo
|
So merging them shouldn't be very difficult
|
[09:51] keffo
|
on the list?
|
[09:51] pieterh
|
"What's the latest version of ZeroMQ that's not LGPL? I need to static link in commercial projects and the LGPL is not an option. "
|
[09:51] pieterh
|
on http://www.zeromq.org/blog:rfc-0mq-contributions
|
[09:53] pieterh
|
not a single sensible comment on that thread, just trolls coming to complain that [sic] switching to LGPL will kill 0MQ...
|
[09:53] keffo
|
will it? =)
|
[09:54] pieterh
|
oh, yes, of course...
|
[09:54] pieterh
|
that's why we have to go back in time and switch to the Microsoft Open Software License or whatever...
|
[09:55] keffo
|
licensing is tricky business, it went and got itself hugely complicated
|
[09:55] keffo
|
I prefer "mine" and "public domain" :)
|
[09:55] pieterh
|
it's not really tricky, just politically sensitive because it involves so much money
|
[09:56] pieterh
|
every license is the contract on which the community grows
|
[09:56] pieterh
|
LGPL and GPL are IME proven beyond a reasonable doubt to be the most effective contracts
|
[09:56] pieterh
|
because they make it impossible to cheat
|
[09:56] pieterh
|
end.
|
[09:56] PerfDave
|
Not *impossible*, see gpl-violations.org. But very difficult ;)
|
[09:57] pieterh
|
impossible in any sustainable sense
|
[09:57] pieterh
|
and GPLv3 closed the loopholes people found in GPLv2
|
[09:57] keffo
|
who has the time anyway :)
|
[09:57] pieterh
|
oh, people love to cheat
|
[09:57] pieterh
|
but communities die when they get parasited
|
[09:58] pieterh
|
so these trolls come and complain that LGPL will kill the community when in fact it creates it
|
[09:58] pieterh
|
sigh.
|
[09:58] keffo
|
:)
|
[10:03] keffo
|
what's the most elaborate zmq based project anyway?
|
[10:07] pieterh
|
keffo: wow, there are some very elaborate ones out there
|
[10:07] pieterh
|
but most are so secret that I'd have to kill you after explaining them
|
[10:07] keffo
|
Oh I just mean scale, not impl. details :)
|
[10:08] keffo
|
I'd like to know what level I'm at :)
|
[10:08] pieterh
|
scale: hundreds to thousands of nodes
|
[10:09] pieterh
|
multiple data centers
|
[10:10] keffo
|
are they mostly about shuffling data around, or generic compute clusters?
|
[10:10] pieterh
|
both cases
|
[10:10] keffo
|
interesting stuff
|
[10:10] pieterh
|
it's also growing rapidly
|
[10:11] keffo
|
I would assume so!
|
[10:11] pieterh
|
the first 0MQ projects a year or two ago were maybe 10 nodes
|
[10:11] pieterh
|
i'd say the scale is growing x10 every six months or so
|
[10:12] keffo
|
It would be nice if the license included algorithmic contributions as opposed to solely sourcecode :)
|
[10:12] pieterh
|
well, ideas cannot be copyrighted
|
[10:13] keffo
|
A lot of information was probably gathered during development of those, which ideally should be shared :)
|
[10:13] pieterh
|
well
|
[10:13] pieterh
|
whenever possible we do move experience into the open source layers
|
[10:14] pieterh
|
however there are often valuable business secrets in these algorithms
|
[10:14] pieterh
|
obviously we do not consider sharing those
|
[10:14] keffo
|
I was more thinking along the lines of "this common-practice method breaks down under these conditions" etc
|
[10:15] pieterh
|
this is what the user guide will eventually cover
|
[10:15] pieterh
|
at least the more common cases
|
[10:15] pieterh
|
we can also try to document some of the higher level patterns as protocols
|
[10:15] keffo
|
Some of the things I've found annoyingly lacking so far have been loadbalancing (which is now covered well enough), and also recursive behaviour, which I think I've solved
|
[10:16] pieterh
|
have you been using the custom routing from Ch3?
|
[10:16] keffo
|
Yeah, pretty much, but with prioqueues
|
[10:16] pieterh
|
what is a prioqueue?
|
[10:16] keffo
|
both for incoming tasks and outgoing results
|
[10:16] keffo
|
priority queue
|
[10:16] pieterh
|
ah, so queues in your broker rather than using the socket queues
|
[10:17] keffo
|
indeed
|
[10:17] pieterh
|
i just didn't want to start mucking with data structures
|
[10:17] pieterh
|
but I think it's inevitable
|
[10:17] pieterh
|
i already had to define a zmsg class, will probably define a zqueue class as well
|
[10:17] keffo
|
messages and tasks have priorities, as well as workers based on a mix of scimark(hardware) and something like a running average network 'behaviour'
|
[10:17] pieterh
|
right
|
[10:18] keffo
|
Another thing I've been struggling to figure out a 'pretty' solution to is how to make workers present their updates when doing long running jobs..
|
[10:18] pieterh
|
intermediate updates?
|
[10:18] keffo
|
That sort of ties into what we talked about earlier with the monitoring app
|
[10:19] keffo
|
yeah
|
[10:19] pieterh
|
two threads in the workers, I assume
|
[10:19] pieterh
|
workers as micro clusters
|
[10:19] pieterh
|
it's fractal :-)
|
[10:19] pieterh
|
every node can be a cluster of nodes
|
[10:19] keffo
|
leaving it solely up to the designer of the job (ie, they post progress at will) leads to abuse most likely
|
[10:20] pieterh
|
sounds like you're solving a lot of interesting problems
|
[10:20] keffo
|
indeed :)
|
[10:20] keffo
|
most interesting of all is the fact that a job can post new jobs :)
|
[10:20] pieterh
|
you should write about it, if you can
|
[10:20] keffo
|
that's a brain teaser if anything :)
|
[10:20] keffo
|
yeah
|
[10:20] keffo
|
It's becoming quite large to be honest :)
|
[10:20] pieterh
|
well, stack-based simulated recursion is an old technique
|
[10:20] keffo
|
but it's working well
|
[10:20] pieterh
|
it's how we used to do quicksort in cobol
|
[10:21] keffo
|
stack-based?
|
[10:21] pieterh
|
your prioqueue can also be a stack
|
[10:21] pieterh
|
you can push jobs to the front
|
[10:21] pieterh
|
or to the back
|
[10:21] keffo
|
ah yeah, that's the priority of the messages :)
|
[10:21] pieterh
|
that's how you simulate recursion
|
[10:21] keffo
|
the deeper the recursion, the higher the priority
|
[10:21] pieterh
|
priority is perhaps the wrong metaphor
|
[10:22] pieterh
|
in fact it's "push these child jobs" followed by "pop next job and execute"
|
[10:23] pieterh
|
hah, I found an old paper on this: http://www.arnoldtrembley.com/svalgard.htm
|
[10:24] pieterh
|
If you look at section 3, you see how Quicksort (recursion) works using a stack
|
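The "push these child jobs, then pop next job and execute" idea can be shown with Quicksort written without recursive calls. This is a minimal sketch, not the code from the linked paper: the function name and the Lomuto partition scheme are choices made here for brevity.

```python
def quicksort(items):
    """Sort a list in place using an explicit stack instead of recursive calls."""
    stack = [(0, len(items) - 1)]  # each entry is a pending job: a range to sort
    while stack:
        low, high = stack.pop()    # pop next job and execute
        if low >= high:
            continue               # ranges of 0 or 1 elements are already sorted
        pivot = items[high]        # Lomuto partition around the last element
        i = low
        for j in range(low, high):
            if items[j] < pivot:
                items[i], items[j] = items[j], items[i]
                i += 1
        items[i], items[high] = items[high], items[i]
        stack.append((low, i - 1))   # push the two child jobs
        stack.append((i + 1, high))
    return items


data = [5, 2, 9, 1, 5, 6]
quicksort(data)
print(data)
```

The stack plays the role the call stack would play in a recursive version: each sub-range is a child job that waits until it is popped.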
[10:24] keffo
|
My solution was to let each worker-node (which owns one worker-process per cpu-core) request an additional worker to be spawned while it sleeps.. So when it posts a child-job, it does so with a higher priority
|
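keffo's rule that deeper child jobs get higher priority can be sketched with a heap-based queue. This is an illustrative model only: the `JobQueue` class, the job names and the `children` table are hypothetical, and nothing here reflects keffo's actual broker.

```python
import heapq
import itertools


class JobQueue:
    """Priority queue where deeper (child) jobs are served first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker: FIFO within one depth

    def post(self, name, depth=0):
        # heapq is a min-heap, so negate depth to pop the deepest job first.
        heapq.heappush(self._heap, (-depth, next(self._counter), name))

    def pop(self):
        neg_depth, _, name = heapq.heappop(self._heap)
        return name, -neg_depth

    def __bool__(self):
        return bool(self._heap)


# Hypothetical job tree: running "A" posts two children, "A.1" posts one more.
children = {"A": ["A.1", "A.2"], "A.1": ["A.1.x"]}

queue = JobQueue()
queue.post("A")
queue.post("B")

order = []
while queue:
    name, depth = queue.pop()
    order.append(name)
    for child in children.get(name, []):
        queue.post(child, depth + 1)  # child jobs jump ahead of shallower ones

print(order)
```

Job "B" only runs after the whole subtree under "A" has finished, so depth-based priority makes the queue behave much like the stack pieterh describes.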
[10:24] pieterh
|
http://www.arnoldtrembley.com/pseudor2.htm
|
[10:24] keffo
|
interesting
|
[10:25] pieterh
|
Leif and I developed these techniques in the 80's...
|
[10:25] keffo
|
how old are you? =)
|
[10:25] pieterh
|
it's very basic but I think it maps correctly to recursive messaging
|
[10:25] pieterh
|
not so old
|
[10:25] pieterh
|
:-)
|
[10:25] pieterh
|
47, to be accurate
|
[10:26] keffo
|
that was grad student times then? =)
|
[10:26] pieterh
|
nope, first job developing tools for large software houses
|
[10:26] pieterh
|
event-driven concurrency in cobol
|
[10:27] keffo
|
cobol of all things :)
|
[10:27] keffo
|
there's a scary amount of need for cobol developers now
|
[10:28] pieterh
|
lol...
|
[10:28] pieterh
|
we used to train people to become cobol developers in like 3 weeks
|
[10:29] keffo
|
That's even more scary :)
|
[10:29] keffo
|
like today's php crowd I guess?
|
[10:29] pieterh
|
hmm, yeah, I guess
|
[10:29] pieterh
|
Cobol was good for mediocre programmers, they could make stuff that worked, and didn't kill the system
|
[10:30] keffo
|
I'd be very afraid if my bank announced they moved from cobol to php :)
|
[10:30] pieterh
|
indeed
|
[10:30] keffo
|
or voting/pacemakers :)
|
[11:33] CIA-20
|
jzmq: Stefan Majer master * rd46166f / src/org/zeromq/ZMQ.java : Merged from upstream - http://bit.ly/c9pkwW
|
[11:33] CIA-20
|
jzmq: Stefan Majer master * r05b384d / src/org/zeromq/ZMQ.java : Reduced duplicate Javadoc comments by references to the corresponding setter. - http://bit.ly/dvaPdr
|
[11:33] CIA-20
|
jzmq: Stefan Majer master * r1a98406 / src/org/zeromq/ZMQ.java : References to the man pages to further clarify the Javadoc. - http://bit.ly/c5FVwB
|
[11:33] CIA-20
|
jzmq: Gonzalo Diethelm master * rc7c9929 / src/org/zeromq/ZMQ.java : Merge branch 'master' of http://github.com/majst01/jzmq into majst02-master - http://bit.ly/93TVKZ
|
[13:27] CIA-20
|
zeromq2: Gonzalo Diethelm master * r87beaaa / (14 files in 4 dirs): ZMQ_TYPE socket option added - http://bit.ly/b3HZYe
|
[13:32] CIA-20
|
zeromq2: Martin Sustrik master * r10bb9d0 / AUTHORS : Dhammika Pathirana was missing from the AUTHORS file for some reason -- fixed - http://bit.ly/cz4TZ7
|
[14:39] CIA-20
|
zeromq2: Steven McCoy master * r00cd7d4 / (7 files in 3 dirs): Upgrade to OpenPGM-5.0.78 - http://bit.ly/aKqJjZ
|
[15:01] CIA-20
|
zeromq2: Steven McCopy master * r1dc4531 / (4 files): (log message trimmed)
|
[15:01] CIA-20
|
zeromq2: * Add assertions to check for OpenPGM calls with invalid parameters.
|
[15:01] CIA-20
|
zeromq2: * Assertion to check that pgm_getaddrinfo is actually returning something.
|
[15:01] CIA-20
|
zeromq2: * Missing pgm_connect call.
|
[15:01] CIA-20
|
zeromq2: * Typo on TOS causing immediate abort.
|
[15:01] CIA-20
|
zeromq2: * Placeholder calls for timeouts whilst continuing spin loop functionality.
|
[15:01] CIA-20
|
zeromq2: * OpenPGM v5 now supports reference counting so remove init checks.
|
[15:28] psino
|
I'm having some issues understanding how a PUSH socket with no downstream nodes is supposed to behave. There seems to be a difference between the sockets depending on whether they were set up with .bind or .connect
|
[15:28] psino
|
I have a small test case here: https://gist.github.com/a7e8b4fdfc303ce4b0e6
|
[15:28] psino
|
the connect_socket sends the data (send does not block) even if there are no downstream nodes, but the bind_socket blocks
|
[15:30] psino
|
is this the expected behaviour? what I understand after reading http://api.zeromq.org/zmq_socket.html is that both should have been blocking
|
[15:31] psino
|
As it says here: "When a ZMQ_PUSH socket enters an exceptional state due to having reached the high water mark for all downstream nodes, or if there are no downstream nodes at all, then any zmq_send(3) operations on the socket shall block until the exceptional state ends or at least one downstream node becomes available for sending; messages are not discarded."
|
[15:37] sustrik
|
psino: yes. it works that way
|
[15:38] sustrik
|
on connect, socket can immediately create a queue to store messages in -- even before actual connection is established
|
[15:38] psino
|
so the manual is incorrect/imprecise?
|
[15:39] sustrik
|
when binding, the socket has to wait till peers connect, it cannot create a queue itself (it doesn't even know whether there'll be a connection in the future)
|
[15:39] sustrik
|
imprecise, i would say
|
[15:39] sustrik
|
connect assumes that there's a peer
|
[15:40] sustrik
|
exactly 1 peer
|
[15:40] sustrik
|
bind makes no such assumption
|
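sustrik's explanation of why the two sockets differ can be captured in a toy model. This is not libzmq code and uses no real 0MQ API: `ToyPushSocket`, `peer_connected` and `would_block` are hypothetical names, the endpoints are placeholders, and high water marks and load balancing across peers are left out.

```python
from collections import deque


class ToyPushSocket:
    """Toy model of the behaviour described above; not libzmq code."""

    def __init__(self):
        self.pipes = []  # one outgoing queue per (assumed) peer

    def connect(self, endpoint):
        # connect assumes exactly one peer, so a queue exists right away,
        # even before the actual connection is established.
        self.pipes.append(deque())

    def bind(self, endpoint):
        # bind cannot know whether anyone will ever connect: no queue yet.
        pass

    def peer_connected(self):
        # Called when a peer connects to a bound socket.
        self.pipes.append(deque())

    def would_block(self):
        return not self.pipes

    def send(self, message):
        if self.would_block():
            raise BlockingIOError("no downstream queue; a real send would block")
        self.pipes[0].append(message)


connected = ToyPushSocket()
connected.connect("tcp://localhost:5555")
connected.send(b"job")            # queued although no peer is up yet

bound = ToyPushSocket()
bound.bind("tcp://*:5555")
blocked_before = bound.would_block()
bound.peer_connected()
bound.send(b"job")                # succeeds once a peer has connected

print(blocked_before, bound.would_block())
```

In this model the connected socket accepts messages immediately, while the bound socket has nowhere to put them until a peer shows up, which matches what psino observed in the test case.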
[15:41] psino
|
hmm
|
[15:52] jashmenn
|
hey can anyone give me some guidance on socket migration between threads
|
[15:52] jashmenn
|
i'm having a tough time figuring out how that works
|
[15:52] jashmenn
|
basically i want to close a socket from a thread different than the thread that started it
|
[15:58] mato
|
sustrik: yo
|
[15:59] sustrik
|
mato: hi
|
[15:59] mato
|
sustrik: i would have appreciated a heads up on the OpenPGM commits
|
[16:00] mato
|
sustrik: it does touch the build system, of which you declared me the maintainer :-)
|
[16:00] sustrik
|
ah
|
[16:00] sustrik
|
i blindly applied the patches :|
|
[16:00] mato
|
also, isn't the author of those commits steve mccoy?
|
[16:00] sustrik
|
yes, he is
|
[16:00] mato
|
ah, it's there
|
[16:00] mato
|
'cept you have a typo
|
[16:00] mato
|
but that's my fault
|
[16:01] sustrik
|
?
|
[16:01] mato
|
sustrik: you should not be typing author names manually if at all possible
|
[16:01] sustrik
|
no idea how to do it automatically
|
[16:01] mato
|
git-am :-)
|
[16:02] sustrik
|
good god!
|
[16:02] mato
|
sustrik: anyway, please do ask for review before committing stuff that touches other people's designated areas of responsibility
|
[16:02] sustrik
|
anyway, can you check the part that touches the build system post hoc?
|
[16:02] mato
|
yes, i can, i will
|
[16:02] sustrik
|
thanks
|
[16:02] mato
|
but please don't do that in future :-)
|
[16:02] mato
|
including e.g. docs
|
[16:02] sustrik
|
sure
|
[16:03] sustrik
|
eek
|
[16:03] mato
|
eek?
|
[16:03] sustrik
|
i've just committed gonzalo's patch
|
[16:03] sustrik
|
which touches docs
|
[16:03] sustrik
|
ZMQ_TYPE socket opt description
|
[16:03] mato
|
it also has a commit to .gitignore for bin/ for some reason
|
[16:03] mato
|
as part of gonzalo's patch
|
[16:03] sustrik
|
hm
|
[16:04] sustrik
|
it shouldn't be there but once it's there it doesn't hurt
|
[16:04] mato
|
sure
|
[16:04] mato
|
but my point is be careful about what is in a patch
|
[16:04] sustrik
|
what are we going to do with patches that intersect different areas of functionality?
|
[16:04] sustrik
|
there's no clear committer there
|
[16:05] mato
|
ask the interested parties for review
|
[16:05] mato
|
that's how it normally works
|
[16:05] sustrik
|
ok, will do
|
[16:05] mato
|
also, if you want my attention quickly a good way to do that is to Cc: me directly as well as sending email to the list
|
[16:05] sustrik
|
ack
|
[16:05] mato
|
that way it lands in my INBOX rather than in the auto-filtered-away list folder :-)
|
[16:42] CIA-20
|
jzmq: Gonzalo Diethelm master * re3e6b7f / (src/Socket.cpp src/org/zeromq/ZMQ.java): Added support for getType() and added some missing constants. - http://bit.ly/at8wUF
|
[17:30] vharron
|
Is there a list of projects using zeromq? I'm trying to understand what it does well by looking at the problems it solves.
|
[17:52] drbobbeaty
|
vharron: I can't give you a list of projects, but I can tell you about mine. I'm using ZMQ as a transport layer using OpenPGM and the reliable multicast. I'm using this because what I need is a transport more than a full-fledged messaging system like Tibco or 29West. It's working out wonderfully in this capacity and we're moving a lot of messages through the system.
|
[17:53] drbobbeaty
|
It's not a real full-blown messaging system - but you can build one with it. I just see its real target as a very advanced transport system with a lot of flexibility.
|
[20:47] CIA-20
|
zeromq2: Steven McCoy master * ra729357 / (6 files): more fixes to (e)pgm transport - http://bit.ly/9E1o2T
|
[23:14] lluad
|
If I have multiple clients connecting to a single server over TCP, is there a good idiom for the server to send an unsolicited message to just one of the clients?
|
[23:20] Zao
|
lluad: When connecting a client, send a connection string to the server if it needs to communicate with you?
|