Monday February 7, 2011

[Time] Name Message
[00:00] Guthur what happens if you are polling and then subsequently close the socket?
[01:27] Guthur mikko, the build server is dead?
[04:39] kabs Hello, if subscriber in pub-sub model goes down or restart we have the option to queue its data at publisher side by using HWM or SWAP but how can we make sure our system doesn't fail if publisher goes down?
[05:03] jhawk28 kabs, you have to persist the data in your program
[05:04] jhawk28 and determine what is acceptable for restarts
[05:04] jhawk28 is it ok to send data to the subscriber again
[05:06] kabs Not sure as of now but I am looking for the options
[05:07] jhawk28 if you need strong durability, and are not concerned so much with performance, AMQP may be a better fit
[05:08] jhawk28
[05:09] kabs jhawk28: Can you elaborate on data persistence here? If the pub has crashed, all its buffers (I/O, network buffers) will be gone, so how can we have persistence of those buffers?
[05:09] jhawk28 it would be external to the ZMQ library
[05:10] jhawk28 in the application space
[05:11] jhawk28 so you could just log data as it was published
[05:12] jhawk28 then when the publisher was restarted, it would just replay the log
[05:13] jhawk28 you could sequence it and use a periodic ack using a pull socket to know how far back, but then you need to keep track of acks for all subscribers (which slows performance)
[05:14] jhawk28 it creates durability, but has a penalty for performance in addition to breaking the desired contract of sending a message once and only once
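The log-and-replay idea jhawk28 describes could look roughly like the sketch below. It is only an illustration under stated assumptions: `JournaledPublisher`, the JSON-lines log format and the `send` callable are hypothetical names, not part of 0MQ, and in a real program `send` would wrap a PUB socket while `acked_seq` would come from the periodic acks collected on a PULL socket.

```python
import json
import os


class JournaledPublisher:
    """Hypothetical helper: journal each message to disk before sending it."""

    def __init__(self, log_path, send):
        self.log_path = log_path
        self.send = send  # assumed callable, e.g. a wrapper around a PUB socket
        self.next_seq = self._last_seq() + 1

    def _entries(self):
        # Read back every journaled message (empty if no log exists yet).
        if not os.path.exists(self.log_path):
            return []
        with open(self.log_path, "r", encoding="utf-8") as log:
            return [json.loads(line) for line in log if line.strip()]

    def _last_seq(self):
        entries = self._entries()
        return entries[-1]["seq"] if entries else 0

    def publish(self, payload):
        entry = {"seq": self.next_seq, "payload": payload}
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(entry) + "\n")
            log.flush()
            os.fsync(log.fileno())  # make sure it is on disk before sending
        self.send(entry)
        self.next_seq += 1
        return entry["seq"]

    def replay(self, acked_seq=0):
        # After a restart, resend everything newer than the last acked sequence.
        resent = 0
        for entry in self._entries():
            if entry["seq"] > acked_seq:
                self.send(entry)
                resent += 1
        return resent
```

As noted in the chat, a subscriber may see a message twice after a replay, so the sequence number is what lets it discard duplicates.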
[05:16] kabs jhawk28: it seems there is no ideal system, some system has some penalty , etc ...
[05:16] jhawk28 correct. durable pub is hard
[05:16] kabs jhawk28: so this pull and ack is to maintain identity
[05:17] jhawk28 is a full blown system just for it
[05:17] jhawk28 you can get ideas from
[05:20] kabs jhawk28: thank you for your input, will see it , just gave it a glance and found one beautiful line "Don't fear the filesystem!" :)
[09:37] ianbarber hi all, anyone got any PGM ideas about this assertion:
[09:37] ianbarber Assertion failed: rc == 0 (connect_session.cpp:82)
[09:37] ianbarber from a zmq_connect (pub, "epgm://;");
[09:38] ianbarber as far as I can see it's being thrown due to a failure in pgm_socket.cpp, possible if (!pgm_setsockopt (sock, IPPROTO_PGM, PGM_SEND_ONLY,
[09:38] ianbarber &send_only, sizeof (send_only))
[09:38] ianbarber due to that
[09:40] guido_g you dont't specify an interface, check that the loopback is *not* used
[09:40] guido_g loopback is nit multicast capable
[09:41] guido_g s/nit/not/
[09:43] ianbarber ah, good point. just tried with eth0, same result
[09:44] ianbarber is there a way of telling whether the interface is capable, or enable/disabling it? i am on a VMWare VM, so it is entirely possible there is something odd about the network interface
[09:44] guido_g ifconfig should tell
[09:44] ianbarber yeah, MULTICAST is in there
[09:44] ianbarber hmm
[09:46] ianbarber i get the same thing with PGM as well as epgm (not massively surprisingly I'm sure)
[09:56] mikko ianbarber: increase ZMQ_RATE
[09:56] mikko seems to fix the symptoms
[09:56] mikko not sure what the actual issue is
[09:58] ianbarber ah, right, let me try that
[09:58] ianbarber what's rate default to then?
[09:59] guido_g 100kbit i think
[10:01] suzan_shakya Hi all, is there any method for listening on 2 sockets other than zmq.Poller()
[10:02] ianbarber awesome, that seemed to work (once I remember to define it as an int64_t rather than an int)
[10:02] ianbarber thanks mikko!
[10:03] ianbarber (and thanks guido for the help as well :))
[10:15] sustrik ianbarber: can you report the problem on the mailing list so that steve-o can have a look at it?
[10:16] sustrik suzan_shakya: you mean two 0mq sockets or two peers?
[10:17] ianbarber sustrik: sure thing
[10:22] mikko
[10:22] mikko s390x building
[10:23] mikko and fails
[10:25] ianbarber oh, cool that that's on the build system, even if it is failing :)
[10:26] sustrik the best thing is that it proves that a distributed build system is doable :)
[10:27] sustrik error: possibly undefined macro: m4_esyscmd_s
[10:27] mikko yeah, just trying to figure out what system it is
[10:27] mikko might very well be a RHEL
[10:27] sustrik that's the last patch that was made to the build system
[10:27] mikko yes
[10:27] mikko m4_esyscmd_s is present in autoconf 2.61
[10:28] mikko which is the minimum requirement
[10:28] mikko not sure what version that system runs
[10:28] sustrik aha
[10:28] mikko encouraging, no google results for "possibly undefined macro: m4_esyscmd_s"
[10:30] mikko works in 2.68 autoconf
[10:30] mikko that system runs 2.63
[10:30] mikko very strange
[10:33] mikko hmm, sustrik i think it's better to revert the patch
[10:33] jugg :(
[10:33] sustrik jugg: what was the problem it solved?
[10:33] mikko i think the behaviour of m4_esyscmd_s is a lot more inconsistent over different platforms
[10:34] jugg debian 6 systems failing
[10:34] mikko jugg: does it fail or warn?
[10:34] mikko if it fails then we need to put more attention to this
[10:35] mikko (the daily build cluster master is debian 6.0)
[10:35] sustrik warning: AC_INIT: not a literal: m4_esyscmd([./ | tr -d '\n'])
[10:35] jugg looks like a warning, I'm not sure if ./configure works after, I did not try.
[10:35] jugg yes, what sustrik said
[10:36] sustrik jugg: can you check whether the package builds ok in spite of the warning?
[10:36] jugg one moment...
[10:36] mikko m4_esyscmd_s seems to be present on all platforms which had daily builds but not on the new s390x
[10:38] mikko another option would be to use 'echo -n' instead of 'echo' in and revert to m4_esyscmd
[10:42] jugg everything seems to build fine without that fix
[10:43] jugg without the fix, ./ produces that warning 6 times.
[10:44] mikko jugg: what about AC_INIT([zeromq],[m4_esyscmd([./])],[])
[10:44] mikko and change to use 'echo -n ' at the end
[10:46] mikko echo -n is not portable
[10:47] jugg it worked however...
[10:50] mikko bash shell most probably
[10:50] mikko ksh:
[10:50] mikko $ echo -n hi
[10:50] mikko -n hi
[10:53] mikko hmm, what about if we moved the tr -d '\n' to the
[10:54] mikko that should be as portable as the old way
[10:56] mikko jugg:
[10:56] mikko can you test that?
[11:01] jugg works
[11:07] mikko works on s390 as well
[11:17] mikko what on earth
[11:17] mikko "conftest.c:15: error: 'Syntax' does not name a type
[12:14] sustrik mikko: should i apply the patch or not?
[13:01] kristsk hello there.
[13:01] kristsk is it possible to send FDs (file descriptors) around with zeromq - if processes reside in same box?
[13:10] sustrik FDs are process-specific
[13:11] sustrik no way of passing them between processes
[13:33] kristsk sustrik - there is a way.
[13:34] sustrik ?
[13:35] kristsk
[13:35] kristsk 2. Master socket - ...
[13:37] sustrik right
[13:37] sustrik i almost forgot about that
[13:37] kristsk i am looking into zeromq as it has some nice features (queues, pairs), but im not sure i want to keep 2 separate subsystems for sharing data among processes
[13:37] sustrik anyway, it doesn't scale, so no such thing with 0mq
[13:37] kristsk even with local transport ?
[13:38] sustrik the point is that the code written for multiple threads can be scaled to multiple processes or boxes
[13:38] sustrik once you start passing FDs around, you break the scalability
[13:39] kristsk too bad :/
[13:39] stockMQ Hi .. where can i find documentation for Forwarder and Streamer.. I am working on VC++
[13:49] stockMQ All i found was this
[13:49] stockMQ
[14:01] mikko sustrik: yes, please
[14:02] mikko it should work on all of the daily build platforms after that
[14:03] mikko well, all platforms
[14:07] CIA-21 zeromq2: Mikko Koppanen master * r908b39b / (
[14:07] CIA-21 zeromq2: m4_esyscmd_s doesnt seem to be portable across different systems
[14:07] CIA-21 zeromq2: Signed-off-by: Mikko Koppanen <> -
[14:08] mikko thanks
[14:08] mikko hmm, so now we have proved that remote build slaves can work
[14:08] sustrik yes
[14:09] sustrik the question now is whether we can get more people to maintain slaves
[14:09] mikko yes, sparc would be nice. possibly hp-ux, aix etc
[14:09] sustrik you should maybe write an email about the new feature, so that people know there's such an option
[14:14] zchrish Has anyone tried to compile zeromq on mingw; I want to use zeromq with Qt.
[14:15] sustrik i recall there's a 0mq/qt project on labs page
[14:16] sustrik
[14:17] zchrish Oh, sorry. I meant compiling zeromq on Windows using mingw.
[14:18] mikko zchrish: yes
[14:18] mikko
[14:18] zchrish Oh, great. Thanks.
[14:19] mikko
[14:19] mikko you can also get a snapshot from there
[14:20] ianbarber mikko: are there any plans to do rpm and deb builds as part of your build empire at any point?
[14:21] ianbarber it seems like you have to run the make dist (i guess?) to generate the spec files etc. properly, so it seems build-y
[14:21] mikko yes, debian would be easy as the build master is debian
[14:21] mikko rpm would require setting up centos/rhel build slave
[14:22] ianbarber really? i mean, you could run rpmbuild on debian ok
[14:24] mikko the dependencies don't come out correctly
[14:24] ianbarber i guess the only difficulty would be rpmbuild complaining about build dependencies on compile, but you should be able to force ignore that, as the build reqs will be there
[14:24] mikko thats not a problem
[14:24] mikko the problem is run-time deps
[14:25] ianbarber verifying them, or an actual problem during the build?
[14:26] ianbarber probably easiest to have the actual centos env though i guess
[14:28] mikko if you build on debian you depend on the libraries present on the host system
[14:28] mikko which are more than likely not present in RHEL/Centos
[14:28] mikko also library versions you link against etc
[14:29] ianbarber oh, i get it. yeah, that's a pain. could build the SRPMS automatically, but yeah, would be easier to have a sep slave
[14:31] sustrik still, the idea of building packages on remote boxes is quite nice
[14:32] sustrik anyone who wants to have fresh packages available could volunteer to run the build on his box
[14:32] sustrik this way we could get quite a nice coverage of different CPUs/OSes
[14:32] kristsk how bout vms ?
[14:33] sustrik are you running vms?
[14:33] sustrik vms is a problem as it's not an unix
[14:33] kristsk uh huh, i meant, VM's not VMS
[14:33] sustrik can you simulate a different CPU in a VM?
[14:34] kristsk probably not :/
[14:36] Guthur talking about builds...I went and broke the clrzmq2 build again...
[14:37] Guthur Visual Studios Solutions are really beginning to bug me
[14:39] sustrik heh
[14:40] mikko Guthur: why dont you use msbuild?
[14:40] mikko kristsk: we are running different operating systems in VMs
[14:41] Guthur mikko: you know I asked myself the same question this morning, hehe
[14:41] mikko kristsk: linux, windows, solaris, freebsd but they are all x86
[14:41] kristsk yeah but sparcish things prolly wont run in vm on x86
[14:41] Guthur mikko: but I do want to make sure that clrzmq2 can be used easily from MSVS
[14:42] mikko not sure if qemu can emulate processors
[14:42] Guthur clrzmq2 will experience more resistance if it doesn't work well from visual studio
[14:42] Guthur IMO
[14:46] sustrik Guthur: +1
[14:48] mikko hi neale1
[14:49] mikko everything seems to be up and running and the latest rounds of builds succeeded
[14:50] neale1 Hi Mikko. Yes, we're building on bigger processor than my little emulator box. It's about 3 models older than the current generation. It's on a system with 700 virtual machines so it's being constrained so it "plays well with others". But 7 mins is pretty good turnaround.
[14:50] mikko neale1: yeah, definitely. thanks for your hard work
[14:51] neale1 No worries
[14:51] mikko i'll send an email out today that mentions the new system z build machine if you don't mind?
[14:51] neale1 Sure
[14:52] neale1 The machine it's running on is part of IBM's OSDL facility running at Marist College in mid-state new York.
[14:52] neale1 I'm the s390x maintainer of mono so that's where I do my builds for it
[15:05] sustrik hm, the people/organizations who participate in build system should be mentioned somewhere so that they get the credit...
[15:12] ianbarber with pub/sub, you can have multiple subs connected to a pub, multiple pubs connected to a sub, but not multiple pubs connected to multiple subs right? does that include multicast transports?
[15:13] mikko stutter: i need to create a wiki page detailing the build system
[15:14] sustrik multicast treats the network switch (hardware) as a device
[15:14] mikko and organisations / people who participate in it
[15:14] stutter mikko: did you mean to tab-complete someone else/
[15:14] mikko yes
[15:14] stutter :P
[15:14] mikko i meant to tab complete sustrik
[15:14] stutter i figured
[15:14] mikko sorry about that
[15:14] sustrik mikko: ok
[15:14] mikko ill try to have time today, ill put a reminder
[15:15] sustrik ianbarber: so both pubs and subs "connect" to the switch
[15:15] sustrik which then forwards any message from any pub to all subs
[15:15] ianbarber that makes sense, thanks
[15:16] mikko now when i look at ian's message about openpgm i think steven sent me an email about it
[15:16] mikko didnt have time to look further
[15:16] sustrik mikko: nice, i think if we made it reasonably visible we can get more participants
[15:16] mikko "Building ZeroMQ I finally get through to the same assertion, and wondering through the code I find the window size is being calculated with options.rate in bytes not kilobytes. So add a × 1024 for both rxw_sqns and txw_sqns."
[15:16] mikko was the relevant part from his email
[15:17] mikko but adding 1024 x didn't solve the issue for me and didn't really have time to look it further
[15:17] sustrik i tried to debug it, but here it seemed that setting the port had failed rather than the rate...
[15:17] sustrik strange
[15:22] ianbarber mikko: there is a comment to that effect, // Data rate is in [B/s]. options.rate is in [kb/s].
[15:23] ianbarber and it is possible I was hitting the assert due to failing pgm_setsockopt (sock, IPPROTO_PGM, PGM_TXW_SQNS,
[15:23] ianbarber &txw_sqns, sizeof (txw_sqns))
[15:28] ianbarber yep, that's it
[15:28] ianbarber if txw_sqns is 0, it asserts
[15:30] ianbarber pgm_max_tpdu is 1500
[15:30] ianbarber the sqns is options.recovery_ivl * options.rate /
[15:30] ianbarber txw_max_tpdu
[15:31] ianbarber and recovery_ivl defaults to 10
[15:31] ianbarber so at 149 or below, it comes out as int 0
[15:31] ianbarber which blows up
[15:31] ianbarber so i think that for this calculation options.rate should be multiplied by 1024
[15:32] ianbarber or zmq:txw_max_tpdu appropriately reduced, but that seems more likely to have knock ons
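The arithmetic ianbarber walks through can be reproduced with a few lines. This is a sketch of the calculation as quoted in the chat, not the libzmq source: `window_sqns_as_quoted` is a made-up name, and Python's `//` stands in for C++ integer division, which gives the same result for non-negative values.

```python
def window_sqns_as_quoted(rate, recovery_ivl=10, max_tpdu=1500):
    """recovery_ivl * rate / max_tpdu, evaluated in integer arithmetic.

    Defaults follow the numbers given in the chat: recovery_ivl of 10
    and a max TPDU of 1500 bytes.
    """
    return recovery_ivl * rate // max_tpdu


if __name__ == "__main__":
    for rate in (100, 149, 150, 1000):
        print(rate, window_sqns_as_quoted(rate))
```

With these defaults any rate of 149 or below gives a window of 0 sequence numbers, which is the value that triggers the assertion.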
[15:33] Guthur is there documentation of ZeroMQs underlying architecture, or is it just the source code?
[15:38] sustrik ianbarber: that's useful
[15:38] sustrik let me see whether it can be fixed
[15:38] ianbarber just tested a multiplying it by 1024 and it works fine
[15:38] ianbarber i can send a patch if useful (it's like 4 lines, so maybe not worth it)
[15:39] sustrik you mean patch that multiplies the value?
[15:39] ianbarber yeah
[15:42] sustrik hm, it seems it should be multiplied by 8 rather than 1024
[15:43] sustrik tpdu is in bytes
[15:43] sustrik rate is in bits
[15:43] ianbarber ah, you're right
[15:43] ianbarber just fired the patch on mail as well, so you could see what i meant, but yeah, it should be 8
[15:44] ianbarber actually, should it?
[15:44] sustrik the calculation is kind of strange
[15:44] ianbarber just looking, the comment does mention bytes to kb
[15:44] sustrik let us decipher what's going on there...
[15:44] ianbarber are you seeing bits to bytes somewhere?
[15:44] sustrik nope
[15:45] stimpie Guthur, all I could find is the source code.
[15:45] sustrik we are speaking about pgm_socket.cpp:201, right?
[15:45] ianbarber yeah, or the equivalent bit in sender
[15:46] sustrik options.recovery_ivl_msec >= 0 ?
[15:46] sustrik options.recovery_ivl_msec * options.rate /
[15:46] sustrik (1000 * rxw_max_tpdu) :
[15:46] sustrik options.recovery_ivl * options.rate /
[15:46] sustrik rxw_max_tpdu
[15:46] ianbarber pgm_max_tpdu is 1500, which looks like bytes
[15:46] sustrik yes, it's bytes
[15:46] sustrik so, first of all, there are 2 options
[15:46] sustrik recovery_ivl and recovery_ivl_msec
[15:47] sustrik the former is in secs, the latter is in msecs
[15:47] sustrik if the latter is set to 0 (default), former value is used
[15:47] ianbarber zmq_rate says kilobits per second
[15:48] ianbarber the msec/sec is dealt with in the * 1000
[15:48] ianbarber so i think we need * 128
[15:48] ianbarber to go from kilobits to bytes
[15:48] sustrik i haven't got that far yet :)
[15:50] ianbarber :)
[15:53] sustrik recovery_ivl * rate * max_tpdu * 1000 / 8
[15:54] sustrik ianbarber: can you double check that?
[15:54] sustrik sorry
[15:55] sustrik recovery_ivl * rate / max_tpdu * 1000 / 8
[15:56] ianbarber was about to say :)
[15:57] sustrik ok, now the ordering of the terms
[15:57] sustrik the computation is done in ints
[15:57] sustrik so we have to beware rounding down to zero
[15:57] ianbarber not sure about that one still
[15:57] sustrik what's wrong?
[15:57] ianbarber 10 * 100 / 1500 * 1000 / 8
[15:57] ianbarber that's with the default numbers plugged in
[15:59] sustrik that equals in ~83 packets
[16:00] ianbarber of course, this is for the ivl_msec case, so it would be 10000 or thereabouts on the left
[16:00] ianbarber yeah, that works then
[16:01] ianbarber though i don't get your 83 - that looks like about 5 packets to me. (i am likely missing something though)
[16:02] sustrik <ianbarber> 10 * 100 / 1500 * 1000 / 8
[16:02] sustrik = 83
[16:03] ianbarber yep, of course. i shoudn't try and do that in my head :)
[16:03] ianbarber yeah, that looks good
[16:03] sustrik ok, now the ordering...
[16:04] sustrik i think we can multiply all multiplicands without overflowing the int
[16:05] sustrik hm
[16:05] sustrik with 1Gb rate we can overflow a 32-bit integer
[16:05] sustrik maybe the computation should be done in int64_t
[16:06] sustrik and then cast to int
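The two concerns raised here, term ordering under integer division and 32-bit overflow, can be checked numerically. The sketch below assumes sustrik's reading of the units (rate in kilobits per second with 1 kilobit = 1000 bits, recovery interval in seconds, TPDU in bytes); the function names are hypothetical. Python integers do not overflow, so the second function reports whether the intermediate product would exceed a signed 32-bit int instead.

```python
INT32_MAX = 2**31 - 1


def sqns_in_typed_order(recovery_ivl, rate_kbps, max_tpdu):
    # Terms in the order typed in the chat, left to right, with integer
    # division at each step: the early division by max_tpdu can hit zero.
    return recovery_ivl * rate_kbps // max_tpdu * 1000 // 8


def sqns_multiply_first(recovery_ivl, rate_kbps, max_tpdu):
    # Multiply everything first, divide once at the end.
    bits = recovery_ivl * rate_kbps * 1000
    would_overflow_int32 = bits > INT32_MAX
    return bits // (8 * max_tpdu), would_overflow_int32


if __name__ == "__main__":
    print(sqns_in_typed_order(10, 100, 1500))        # rounds down to zero
    print(sqns_multiply_first(10, 100, 1500))        # about 83 packets
    print(sqns_multiply_first(10, 1_000_000, 1500))  # 1 Gb/s rate
```

Multiplying first keeps the defaults at roughly 83 packets, but at a 1 Gb/s rate the intermediate product no longer fits in 32 bits, which is why doing the computation in `int64_t` is suggested.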
[16:07] sustrik what unit is rxw_sqns meant to be in
[16:08] sustrik number of PGM packets?
[16:08] ianbarber sequence numbers
[16:08] ianbarber yeah, packets basically
[16:08] sustrik so the computation is completely wrong now
[16:09] ianbarber there is also a limit to how big it is
[16:09] sustrik it's 125x less than it should be :|
[16:09] ianbarber spec says no greater than half the sequence number space less one
[16:09] ianbarber though I suspect the space is pretty huge :)
[16:09] sustrik yeah, something like 2billion
[16:09] sustrik irrelevant here
[16:09] ianbarber yeah
[16:10] sustrik what i don't get is how it can work for anybody
[16:10] ianbarber well, it works if you have a high enough rate
[16:10] sustrik the buffers allocated now are 125x smaller than they should be
[16:11] ianbarber you just get a small sqns range, so it'll effectively get a lot less throughput than it should i think
[16:11] sustrik well, yes, but that means your recovery interval is 125x shorter
[16:11] ianbarber yeah, so less reliable
[16:11] sustrik strange
[16:11] ianbarber but reliability is pretty good on networks - most people are unlikely to hit it unless they're dropping packets
[16:11] sustrik maybe
[16:12] sustrik but if we fix it now
[16:12] sustrik people's buffers will grow 126x
[16:12] sustrik 125x
[16:12] sustrik and possibly get out of memory
[16:12] sustrik :(
[16:12] ianbarber hmm
[16:12] mikko at the moment it asserts (?)
[16:12] sustrik mikko: it looks like the computation of pgm buffer size is wrong
[16:12] mikko just came back
[16:13] mikko i think that is something that should go to 2.1.x
[16:13] sustrik yielding buffers 125x smaller than they should be
[16:13] ianbarber not sure, txw_seqns does interact with txw_secs
[16:13] ianbarber so perhaps the effect isn't as large as it might be, of increasing the size
[16:13] sustrik i think we need to consult steven before changing it
[16:14] sustrik he doesn't seem to be online now
[16:14] ianbarber yeah, he's the best person to comment
[16:14] sustrik ianbarber: can you drop him a note on the mailing list?
[16:14] ianbarber sure thing
[16:14] sustrik thanks
[16:28] Guthur I was afraid it was only the source code as documentation
[16:31] sustrik Guthur: what exactly do you need?
[16:39] Guthur sustrik: understanding, hehe
[16:39] mikko neale1: on the build page should i list "Contributed by: IBM" or something else
[16:40] Guthur I was hoping to provide patches, but there is a lot to understand
[16:40] sustrik i've started writing something:
[16:40] sustrik
[16:40] Guthur Specifically the duplicate identity issue
[16:40] sustrik but never got too far
[16:40] sustrik in any case, feel free to ask on irc
[16:41] Guthur in general I think it would make the learning curve for new contributors a little less steep
[16:41] sustrik definitely
[16:41] ianbarber that duplicate identity issue seems tricky
[16:42] sustrik the only problem is someone has to write it down and maintain it as the development goes on
[16:44] Guthur ianbarber: My main issue with it is the fact it crashes the server side
[16:44] Guthur clients crashing a server is not good
[16:44] ianbarber yep, i agree
[16:44] ianbarber i had the same reaction
[16:52] ianbarber hmm
[16:52] ianbarber sessions_sync.lock ();
[16:52] ianbarber bool registered = sessions.insert (
[16:52] ianbarber sessions_t::value_type (name_, session_)).second;
[16:52] ianbarber sessions_sync.unlock ();
[16:52] ianbarber return registered;
[16:52] ianbarber just google, the first ref I found for stl map said insert would return the inserted thing, or an operator
[16:52] ianbarber not a bool
[16:53] ianbarber so the insert wouldn't report a failure, even if it had encountered a collision?
[16:53] ianbarber sure sustrik will correct me in short order :)
[17:01] mikko
[17:01] mikko needs information about contributors as well
[17:03] ianbarber surely put yourself on there under contributors
[17:04] ianbarber others can add themselves I guess would be easiest, it is a wiki
[17:25] mikko clrzmq2 fails to build atm
[17:30] Guthur mikko: yeah sorry about that
[17:30] Guthur I know the issue, but can not fix at the moment, at work
[17:31] mikko thats fine
[17:31] mikko just making sure that it's known
[17:31] neale1 mikko: There's a mono environment on the same s390x build server if you want to exploit that. What level of mono do you require? It's 2.7.1 atm, but can be brought up to current git head if required
[17:31] Guthur the solution file is the source of the problem
[17:32] mikko i got mono 2.6.7
[17:32] mikko is there a difference with mono across architectures?
[17:32] Guthur I should be able to push a fix in a couple of hours
[17:33] neale1 mikko: Shouldn't be. I build from same base, just have s390x jit to do bytecode to s390x machine code xlate
[17:33] mikko neale1: should i mention that "Linux s390x build slave was contributed by IBM" ?
[17:33] neale1 No, by Marist College would be better
[17:33] mikko neale1: i could add a mono build for clrzmq and clrzmq2
[17:34] mikko neale1: sure, will add that
[17:34] neale1 Yes
[17:34] mikko do you want url to anywhere?
[17:35] neale1 I'll ask the Marist folks
[17:35] mikko neale1:
[17:35] mikko you can edit the page to add the link
[17:37] neale1 mikko: Tks
[17:38] ianbarber ah, there's a .second on that sessions code, so it's all good. Stepping through it looks like register session is only being called once, for some odd reason
[17:41] neale1 mikko: Do I have a wikidot id to update that page?
[17:42] mikko neale1: i think you can create one
[17:42] mikko neale1: im not sure if you have one already :)
[17:44] neale1 mikko: Cld u just add this
[17:44] mikko neale1: will do
[17:45] mikko neale1: done
[17:54] sustrik ianbarber: insert returns pair, the code selects the second part of the pair, which is bool
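For readers more at home in Python, the behaviour sustrik describes (insert does not overwrite an existing key, and the boolean tells the caller whether the insertion happened) can be mirrored as below. This is an analogue written for illustration, not a translation of the libzmq code; `SessionRegistry` and its methods are hypothetical names.

```python
import threading


class SessionRegistry:
    """Hypothetical analogue of the locked sessions map quoted in the chat."""

    def __init__(self):
        self._sessions = {}
        self._lock = threading.Lock()

    def register(self, name, session):
        # Like taking .second of std::map::insert: True if the name was
        # free and the session was stored, False if the name already exists.
        with self._lock:
            if name in self._sessions:
                return False
            self._sessions[name] = session
            return True

    def lookup(self, name):
        with self._lock:
            return self._sessions.get(name)
```

A second registration under the same name returns False and leaves the first session in place, so a duplicate identity is reported to the caller rather than silently replacing the existing entry.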
[17:55] ianbarber sustrik: spotted that, seems correct
[17:55] ianbarber weird thing I had when debugging the dual identities thing is that it only seems to be hit once
[17:55] ianbarber so I suspect the problem is further up the chain, in attach or something like that
[17:55] sustrik ianbarber: yes, possibly
[17:57] sustrik where exactly does it crash now?
[17:57] ianbarber Assertion failed: new_sndbuf > old_sndbuf (mailbox.cpp:182)
[17:58] ianbarber i think that's just because if it gets a send fail it just tries to increase buffer size (until it can't)
[17:58] sustrik ianbarber: that has nothing to do with duplicate identities imo
[17:59] sustrik well, unless duplicate identities cause an infinite or extremely fast generation of commands
[17:59] sustrik hm, the latter may be the case...
[18:01] ianbarber sustrik: i don't get the crash on connect, only when I attempt to send something to connected peers
[18:01] ianbarber so it may be something of that nature
[18:02] rook- hello? I have a noob question if someone has a few minutes
[18:02] sustrik ianbarber: interesting
[18:02] sustrik rook-: shoot
[18:03] sustrik ianbarber: it looks like large amount of commands is generated for some reason
[18:03] rook- I'm looking for a messaging system to support a distributed "eventually consistent" cloud architecture...
[18:03] sustrik you need to find out why it is so
[18:03] sustrik check the send_command() function to hook into command passing mechanism
[18:03] rook- all the mq's I've found seem to support 2 models: durable queues and pub/sub (fanout)
[18:04] rook- but nothing seems to support publishing blindly to n-subscribers and guaranteeing that each subscriber receives 1 and only 1 copy of the message
[18:04] sustrik it's impossible
[18:05] sustrik slow consumers are the problem
[18:05] rook- by blindly, I mean that the publisher just writes 1 message and all subs who've scribed (whether they're connected or not) eventually get it
[18:05] rook- it looks to me like a pub-sub with a replay service or something similar would solve the problem, but haven't seen anything like it
[18:06] sustrik you would need infinite memory to hold the messages for slow/stuck consumers
[18:06] rook- infinite/disk
[18:06] rook- yea
[18:07] sustrik yes, so it can't be done
[18:07] rook- um.. I have a hard time believing that...
[18:08] rook- since so many services are using eventually consistent frameworks of their own devising
[18:08] sustrik shrug
[18:09] rook- I mean, there may be the need for a high water mark/"beyond this point the stuck consumer needs help with rebuilding"
[18:09] rook- but "can't be done" seems improbable
[18:10] sustrik there's an alternative, yes
[18:10] sustrik you can block the sender
[18:10] sustrik but that means that the slow/stuck consumer can stop all the data distribution
[18:10] sustrik even to the properly working consumers
[18:11] rook- yea that'd suck :D
[18:11] sustrik so it's either-or
[18:11] sustrik either drop messages when there's no space to store them
[18:11] sustrik or block the whole distribution
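The either-or that sustrik lays out can be illustrated with bounded per-subscriber queues. The sketch below uses only the standard library and is not how 0MQ is implemented: `Fanout`, `hwm` and the `block` flag are assumptions chosen to show the two policies side by side.

```python
import queue


class Fanout:
    """Hypothetical in-process fanout with one bounded queue per subscriber."""

    def __init__(self, hwm, block=False):
        self.hwm = hwm        # per-subscriber limit, like a high water mark
        self.block = block    # False: drop when full, True: block the sender
        self.queues = {}
        self.dropped = {}

    def subscribe(self, name):
        self.queues[name] = queue.Queue(maxsize=self.hwm)
        self.dropped[name] = 0

    def publish(self, msg, timeout=None):
        for name, q in self.queues.items():
            if self.block:
                # A full queue stalls here, so subscribers later in the
                # loop do not get the message until the slow one drains.
                q.put(msg, timeout=timeout)
            else:
                try:
                    q.put_nowait(msg)
                except queue.Full:
                    self.dropped[name] += 1  # only the slow subscriber loses data
```

In drop mode a stuck subscriber loses messages once its queue is full while the others keep receiving. In block mode `publish` waits on the stuck subscriber (raising `queue.Full` if a timeout is given), and nothing reaches the subscribers behind it.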
[18:12] rook- hrm...
[18:13] rook- I think in our env. a rolling window would potentially be ok... i.e. if the consumer isn't around for a given period of time, then remediation is required for it to rejoin and resync
[18:13] sustrik ack
[18:13] rook- true
[18:13] sustrik that's the standard model for pub-sub
[18:14] rook- sorry?
[18:14] rook- I thought that pub/sub was limited by immediate connectivity and deliverability
[18:14] rook- i.e. if consumer isn't there to receive message, or isn't accepting the message, it may miss out
[18:16] rook- is my impression of pub/sub overly simplified?
[18:16] rook- / have I missed something?
[18:18] ianbarber rook: that's how pubsub works in 0MQ, without identities. If you have identities set then it'll try and buffer up messages for them.
[18:19] sustrik rook-: yes, you are right
[18:19] sustrik that's the general model
[18:19] sustrik however, different solutions add some amount of reliability
[18:20] sustrik for example, PGM retransmits the missing packets if they are still in the reliability window
[18:20] sustrik etc.
[18:21] Guthur is the HWM per durable subscriber or does it affect all globally?
[18:21] sustrik it's per peer
[18:22] Guthur is there no way to clear it, besides the peer connecting?
[18:23] rook- so I can, in 0mq, have a publisher, who just publishes events, and I can have n-consumers that each, with some reliability, receive every message and the publisher doesn't need to know or be configured with what the consumers are, how many of them there are, etc.
[18:23] rook- ?
[18:27] rook- ?
[18:28] rook- sustrik?
[18:31] sustrik Guthur: no
[18:31] sustrik rook-: yes
[18:32] rook- cool!
[18:32] rook- is the size of the window configurable? I clearly have more reading to do, url reference handy? Pointers?
[18:32] rook- (thanks for your help btw)
[18:33] Guthur rook-:
[18:33] Guthur you may have seen it already
[18:33] sustrik or check the man pages
[18:34] rook- yea saw the docs - to be honest, I find them... difficult.. very friendly and jovial but not as... concise as typical docs ;)
[18:34] sustrik there's reference
[18:34] sustrik that's concise
[18:35] rook- ok I'll poke about
[18:35] rook- now that I know that 0mq handles it
[18:35] rook- I feel more justified in digging harder for the info :)
[18:35] rook- thanks for the help - off to read
[18:35] rook- :)
[19:45] mikko good evening
[19:58] neale1 mikko: I am getting an article on 0MQ printed in the April/May edition of z/Journal (a journal for IBM mainframers). It's an introduction to 0MQ for a group more familiar with things like MQSeries
[19:59] mikko nice
[19:59] mikko i read about ibm system z
[19:59] mikko seems like pretty serious computing system
[20:00] neale1 Yep, there's a major conference in Anaheim later this month which I'm going to. I wasn't able to schedule a 0MQ talk this time but I'll be trying to get one on the agenda for August in Orlando. As I'm on the program committee, it should get on
[20:02] mikko what is the conference in Orlando?
[20:02] neale1 Same as the one in Anaheim - they hold 2 yearly. It's called "SHARE"
[20:02] mikko is it mainframy conference?
[20:03] neale1 Yep
[20:03] neale1
[20:05] mikko seems like a big conference
[20:06] mikko i've worked with some of the technologies mentioned in the talks
[20:06] mikko mainly on integrating them with web
[20:07] sustrik the introduction to 0mq for mq series users is something that's badly missing
[20:08] sustrik neale1: would your article be publicly available?
[20:08] neale1 sustrik: Yes, z/Journal has it online as well as in a physical magazine.
[20:08] sustrik great!
[20:09] neale1 sustrik: I'm not doing a if you're using WebSphere MQ and you do x, then on 0MQ you do y
[20:09] sustrik understood
[20:09] neale1 It's a straight introduction trying to put it in context of where it would be useful.
[20:09] sustrik i mean, the problem is that if you have "corporate messaging" experience
[20:09] neale1 A lot of folks want to avoid the complexity and cost of MQ
[20:09] sustrik you have quite lot of expectations...
[20:09] neale1 yes
[20:10] sustrik that don't apply to 0mq
[20:10] sustrik so i think the point of the article is to explain the difference between mq series and 0mq
[20:10] sustrik like "what you should expect from 0mq"
[20:10] sustrik right?
[20:12] sustrik in any case, that kind of introduction is missing so far
[20:14] neale1 At the moment it's even more basic than that. I want to aim at the sockets programmers who are dreading having to go to a full brokered option.
[20:14] sustrik ah, that's easier, yes
[20:15] neale1 The linux on system z community is about 10 years old now, but most of its growth has been in the past 5. We're seeing a mix of traditional sysadmin/programmer types from the UNIX world combining with the old hands from the traditional mainframe set.
[20:16] neale1 I want to conclude the article with a set of questions people can ask when they want to select a technology. It might be useful to open the floor here to suggestions.
[20:16] sustrik definitely
[20:17] sustrik it's a rather hard question to ask though
[20:17] sustrik as it's not only a technical question
[20:18] sustrik and depends on whether there are people skilled in the technology, whether the company has licenses bought already etc.
[20:18] neale1 In my own case I discovered 0MQ when I was tasked with a project that needed to have several "collector" clients receive raw data, package it, and send it to a sink which would process the collected data. In addition I wanted to add collectors at will, ensure delivery, receive configuration information from the server. And do this in a minimal footprint with little effect on overall CPU on the systems where those collectors are ru
[20:18] neale1 Oh, and not require the customer to install a full-fledged broker type system
[20:19] sustrik right, then using 0mq makes perfect sense
[20:19] sustrik when you spoke about mainframes i thought of integrating legacy apps
[20:19] sustrik rather than writing new apps
[20:19] sustrik but the linux subsystem is probably not that legacy-oriented
[20:23] neale1 Actually, IBM has spent a lot of time/money putting hooks into its "legacy" systems to allow them to integrate with Linux. One thing we'd need to do on the legacy side though is to port the code (or provide a minimal set of things like the tcp: transport)
[20:23] sustrik right
[20:24] sustrik anyway, if you are going to have a lecture about 0mq
[20:24] neale1 Within a mainframe you can talk between the legacy systems like z/OS and a z/Linux system at memory-to-memory speeds even though it appears to be a layer2/layer3 network
[20:24] sustrik announce it on (see top right)
[20:24] neale1 Will do
[20:24] sustrik neale1: are you familiar with z/OS as well?
[20:26] neale1 sustrik: Yes as well as z/VSE and z/VM. The latter two I've been using for *many* years now. I have only a few years experience with z/OS.
[20:26] neale1 0MQ could easily be ported to what z/OS calls Unix System Services (USS). It has a full UNIX-branded set of APIs.
[20:27] sustrik wow
[20:27] sustrik they have a posix-y interface, right?
[20:27] neale1 Yes, fully posix-compliant
[20:27] sustrik that's need
[20:27] sustrik neat
[20:28] sustrik would be nice to have the port one day
[20:28] sustrik but i assume the build system would be the main problem
[20:28] neale1 They have most of the common tools too: autoconf/automake/libtool - at what level I'm not sure
[20:28] sustrik even better
[20:29] sustrik there's a 0mq port to VMS
[20:29] sustrik there's a POSIX interface available
[20:29] neale1 Yes, I saw that.
[20:29] sustrik but the problem is that there's no unix shell
[20:29] sustrik so the autotools don't work
[20:29] neale1 Ugh
[20:30] neale1 On USS they have a korn shell but you can also run bash
[20:30] sustrik sounds good
[20:30] Guthur is there a zeromq irc log?
[20:30] sustrik travlr used to log the conversation
[20:30] neale1 Same goes with z/VM
[20:30] sustrik i see
[20:31] Guthur ah ha, and he still does
[20:31] Guthur cheers sustrik
[20:31] sustrik neale1: maybe give it a shot if you have some spare time
[20:31] Guthur oh it's a bit behind though
[20:32] neale1 I'm going to be installing an upgrade of our z/OS system in the near future, so I'll make sure the USS environment and tools are there.
[20:32] sustrik the logger used to hang around under the name 'zmqircd'
[20:32] sustrik but it doesn't seem to have been here for some time already
[20:33] sustrik neale1: nice
[20:33] mikko i got logs
[20:33] mikko since last Feb i think
[20:34] Guthur it only goes up until Feb 4 2011 or something
[20:34] Guthur I suppose it just needs the newest ones placed on it
[20:36] mikko it should be up to date
[20:37] Guthur here
[20:37] Guthur correct?
[20:37] mikko
[20:37] mikko mine's here
[20:37] mikko it just links to my normal irc log from this channel
[21:23] Guthur ok, hopefully that's the clrzmq2 build fixed
[22:28] Guthur just whipped together this asyncReturn device
[22:29] Guthur the return call could be made with a push socket supplying the appropriate address
[22:30] Guthur As an aside I have just moved the RunningLoop into the base class
[22:31] Guthur Just a simple thing, but thought I would share anyway