[Time] Name | Message |
[00:00] Guthur
|
what happens if you are polling and then subsequently close the socket?
|
[01:27] Guthur
|
mikko, the build server is dead?
|
[04:39] kabs
|
Hello, if subscriber in pub-sub model goes down or restart we have the option to queue its data at publisher side by using HWM or SWAP but how can we make sure our system doesn't fail if publisher goes down?
|
[05:03] jhawk28
|
kabs, you have to persist the data in your program
|
[05:04] jhawk28
|
and determine what is acceptable for restarts
|
[05:04] jhawk28
|
is it ok to send data to the subscriber again
|
[05:06] kabs
|
Not sure as of now but I am looking for the options
|
[05:07] jhawk28
|
if you need strong durability, and are not concerned so much with performance, AMQP may be a better fit
|
[05:08] jhawk28
|
https://github.com/rabbitmq/rmq-0mq
|
[05:09] kabs
|
jhawk28:Can you elaborate on data persistence here, since if pub is crashed all it's buffers( I/O, network buffers ) will be gone, how can we have persistence of those buffers ??
|
[05:09] jhawk28
|
it would be external to the ZMQ library
|
[05:10] jhawk28
|
in the application space
|
[05:11] jhawk28
|
so you could just log data as it was published
|
[05:12] jhawk28
|
then when the publisher was restarted, it would just replay the log
|
[05:13] jhawk28
|
you could sequence it and use a periodic ack using a pull socket to know how far back, but then you need to keep track of acks for all subscribers (which slows performance)
|
[05:14] jhawk28
|
it creates durability, but has a penalty for performance in addition to breaking the desired contract of sending a message once and only once
|
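The log-and-replay idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: `PublishLog`, the JSON-lines file format and `lowest_ack` are hypothetical names, not part of 0MQ, and the actual socket send is left out.

```python
import json
import os
import tempfile


class PublishLog:
    """Append-only log of published messages, each tagged with a sequence number.

    Hypothetical helper for illustration; it lives in application space,
    outside the messaging library.
    """

    def __init__(self, path):
        self.path = path
        self.next_seq = 0
        # After a restart, recover the next sequence number from the existing log.
        if os.path.exists(path):
            for seq, _body in self.replay(0):
                self.next_seq = seq + 1

    def append(self, body):
        """Record a message before it is handed to the socket; return its sequence number."""
        seq = self.next_seq
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"seq": seq, "body": body}) + "\n")
            f.flush()
            os.fsync(f.fileno())  # force the record to disk before publishing
        self.next_seq = seq + 1
        return seq

    def replay(self, from_seq):
        """Yield (seq, body) for every logged message with seq >= from_seq."""
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                if record["seq"] >= from_seq:
                    yield record["seq"], record["body"]


path = os.path.join(tempfile.mkdtemp(), "pub.log")
log = PublishLog(path)
for body in ["a", "b", "c"]:
    log.append(body)

# Simulate a publisher restart: a new object reads the same file.
restarted = PublishLog(path)
lowest_ack = 1  # assumption: lowest sequence number acked by any subscriber
resend = list(restarted.replay(lowest_ack + 1))
```

Replaying from the lowest ack means some subscribers may see a message twice, which is the broken "once and only once" contract mentioned above; the fsync on every append is where the performance penalty shows up.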
[05:16] kabs
|
jhawk28: it seems there is no ideal system, some system has some penalty , etc ...
|
[05:16] jhawk28
|
correct. durable pub is hard
|
[05:16] kabs
|
jhawk28: so this pull and ack is to maintain identity
|
[05:17] jhawk28
|
http://sna-projects.com/kafka/ is a full blown system just for it
|
[05:17] jhawk28
|
you can get ideas from http://sna-projects.com/kafka/design.php
|
[05:20] kabs
|
jhawk28: thank you for your input, will see it , just gave it a glance and found one beautiful line "Don't fear the filesystem!" :)
|
[09:37] ianbarber
|
hi all, anyone got any PGM ideas about this assertion:
|
[09:37] ianbarber
|
Assertion failed: rc == 0 (connect_session.cpp:82)
|
[09:37] ianbarber
|
from a zmq_connect (pub, "epgm://;239.192.0.1:7601");
|
[09:38] ianbarber
|
as far as I can see it's being thrown due to a failure in pgm_socket.cpp, possible if (!pgm_setsockopt (sock, IPPROTO_PGM, PGM_SEND_ONLY,
|
[09:38] ianbarber
|
&send_only, sizeof (send_only))
|
[09:38] ianbarber
|
due to that
|
[09:40] guido_g
|
you don't specify an interface, check that the loopback is *not* used
|
[09:40] guido_g
|
loopback is nit multicast capable
|
[09:41] guido_g
|
s/nit/not/
|
[09:43] ianbarber
|
ah, good point. just tried with eth0, same result
|
[09:44] ianbarber
|
is there a way of telling whether the interface is capable, or enable/disabling it? i am on a VMWare VM, so it is entirely possible there is something odd about the network interface
|
[09:44] guido_g
|
ifconfig should tell
|
[09:44] ianbarber
|
yeah, MULTICAST is in there
|
[09:44] ianbarber
|
hmm
|
[09:46] ianbarber
|
i get the same thing with PGM as well as epgm (not massively surprisingly I'm sure)
|
[09:56] mikko
|
ianbarber: increase ZMQ_RATE
|
[09:56] mikko
|
seems to fix the symptoms
|
[09:56] mikko
|
not sure what the actual issue is
|
[09:58] ianbarber
|
ah, right, let me try that
|
[09:58] ianbarber
|
what's rate default to then?
|
[09:59] guido_g
|
100kbit i think
|
[10:01] suzan_shakya
|
Hi all, is there any method for listening on 2 sockets other than zmq.Poller()
|
[10:02] ianbarber
|
awesome, that seemed to work (once I remember to define it as an int64_t rather than an int)
|
[10:02] ianbarber
|
thanks mikko!
|
[10:03] ianbarber
|
(and thanks guido for the help as well :))
|
[10:15] sustrik
|
ianbarber: can you report the problem on the mailing list so that steve-o can have a look at it?
|
[10:16] sustrik
|
suzan_shakya: you mean two 0mq sockets or two peers?
|
[10:17] ianbarber
|
sustrik: sure thing
|
[10:22] mikko
|
http://build.valokuva.org/job/ZeroMQ2-core-master-GCC-linux_s390x/11/console
|
[10:22] mikko
|
s390x building
|
[10:23] mikko
|
and fails
|
[10:25] ianbarber
|
oh, cool that that's on the build system, even if it is failing :)
|
[10:26] sustrik
|
the best thing is that it proves that a distributed build system is doable :)
|
[10:27] sustrik
|
configure.in:9: error: possibly undefined macro: m4_esyscmd_s
|
[10:27] mikko
|
yeah, just trying to figure out what system it is
|
[10:27] mikko
|
might very well be a RHEL
|
[10:27] sustrik
|
that's the last patch that was made to the build system
|
[10:27] mikko
|
yes
|
[10:27] mikko
|
m4_esyscmd_s is present in autoconf 2.61
|
[10:28] mikko
|
which is the minimum requirement
|
[10:28] mikko
|
not sure what version that system runs
|
[10:28] sustrik
|
aha
|
[10:28] mikko
|
encouraging, no google results for "possibly undefined macro: m4_esyscmd_s"
|
[10:30] mikko
|
works in 2.68 autoconf
|
[10:30] mikko
|
that system runs 2.63
|
[10:30] mikko
|
very strange
|
[10:33] mikko
|
hmm, sustrik i think it's better to revert the patch
|
[10:33] jugg
|
:(
|
[10:33] sustrik
|
jugg: what was the problem it solved?
|
[10:33] mikko
|
i think the behaviour of m4_esyscmd_s is a lot more inconsistent over different platforms
|
[10:34] jugg
|
debian 6 systems failing
|
[10:34] mikko
|
jugg: does it fail or warn?
|
[10:34] mikko
|
if it fails then we need to put more attention to this
|
[10:35] mikko
|
(the daily build cluster master is debian 6.0)
|
[10:35] sustrik
|
configure.in:9: warning: AC_INIT: not a literal: m4_esyscmd([./version.sh | tr -
|
[10:35] sustrik
|
d '\n'])
|
[10:35] jugg
|
looks like a warning, I'm not sure if ./configure works after, I did not try.
|
[10:35] jugg
|
yes, what sustrik said
|
[10:36] sustrik
|
jugg: can you check whether the package builds ok in spite of the warning?
|
[10:36] jugg
|
one moment...
|
[10:36] mikko
|
m4_esyscmd_s seems to be present on all platforms which had daily builds but not on the new s390x
|
[10:38] mikko
|
another option would be to use 'echo -n' instead of 'echo' in version.sh and revert to m4_esyscmd
|
[10:42] jugg
|
everything seems to build fine without that fix
|
[10:43] jugg
|
without the fix, ./autogen.sh produces that warning 6 times.
|
[10:44] mikko
|
jugg: what about AC_INIT([zeromq],[m4_esyscmd([./version.sh])],[zeromq-dev@lists.zeromq.org])
|
[10:44] mikko
|
and change version.sh to use 'echo -n ' at the end
|
[10:46] mikko
|
echo -n is not portable
|
[10:47] jugg
|
it worked however...
|
[10:50] mikko
|
bash shell most probably
|
[10:50] mikko
|
ksh:
|
[10:50] mikko
|
$ echo -n hi
|
[10:50] mikko
|
-n hi
|
[10:53] mikko
|
hmm, what about if we moved the tr -d '\n' to the version.sh
|
[10:54] mikko
|
that should be as portable as the old way
|
[10:56] mikko
|
jugg: https://gist.github.com/35b8a685a38c0e4a92c5
|
[10:56] mikko
|
can you test that?
|
[11:01] jugg
|
works
|
[11:07] mikko
|
works on s390 as well
|
[11:17] mikko
|
what on earth
|
[11:17] mikko
|
"conftest.c:15: error: 'Syntax' does not name a type
|
[12:14] sustrik
|
mikko: should i apply the patch or not?
|
[13:01] kristsk
|
hello there.
|
[13:01] kristsk
|
is it possible to send FDs (file descriptors) around with zeromq - if processes reside in same box?
|
[13:10] sustrik
|
FDs are process-specific
|
[13:11] sustrik
|
no way of passing them between processes
|
[13:33] kristsk
|
sustrik - there is a way.
|
[13:34] sustrik
|
?
|
[13:35] kristsk
|
https://github.com/pgte/fugue/wiki/How-Fugue-Works
|
[13:35] kristsk
|
2. Master socket - ...
|
[13:37] sustrik
|
right
|
[13:37] sustrik
|
i almost forgot about that
|
[13:37] kristsk
|
i am looking into zeromq as it has some nice features (queues, pairs), but im not sure i want to keep 2 separate subsystems for sharing data among processes
|
[13:37] sustrik
|
anyway, it doesn't scale, so no such thing with 0mq
|
[13:37] kristsk
|
even with local transport ?
|
[13:38] sustrik
|
the point is that the code written for multiple threads can be scaled to multiple processes or boxes
|
[13:38] sustrik
|
once you start passing FDs around, you break the scalability
|
[13:39] kristsk
|
too bad :/
|
[13:39] stockMQ
|
Hi .. where can i find documentation for Forwarder and Streamer.. I am working on VC++
|
[13:49] stockMQ
|
All i found was this
|
[13:49] stockMQ
|
http://api.zeromq.org/zmq_forwarder.html
|
[14:01] mikko
|
sustrik: yes, please
|
[14:02] mikko
|
it should work on all of the daily build platforms after that
|
[14:03] mikko
|
well, all platforms
|
[14:07] CIA-21
|
zeromq2: 03Mikko Koppanen 07master * r908b39b 10/ (configure.in version.sh):
|
[14:07] CIA-21
|
zeromq2: m4_esyscmd_s doesnt seem to be portable across different systems
|
[14:07] CIA-21
|
zeromq2: Signed-off-by: Mikko Koppanen <mikko.koppanen@gmail.com> - http://bit.ly/fCd8BW
|
[14:08] mikko
|
thanks
|
[14:08] mikko
|
hmm, so now we have proved that remote build slaves can work
|
[14:08] sustrik
|
yes
|
[14:09] sustrik
|
the question now is whether we can get more people to maintain slaves
|
[14:09] mikko
|
yes, sparc would be nice. possibly hp-ux, aix etc
|
[14:09] sustrik
|
you should maybe write an email about the new feature, so that people know there's such an option
|
[14:14] zchrish
|
Has anyone tried to compile zeromq on mingw; I want to use zeromq with Qt.
|
[14:15] sustrik
|
i recall there's a 0mq/qt project on labs page
|
[14:16] sustrik
|
http://labs.wordtothewise.com/zeromqt/
|
[14:17] zchrish
|
Oh, sorry. I meant compiling zeromq on Windows using mingw.
|
[14:18] mikko
|
zchrish: yes
|
[14:18] mikko
|
http://build.valokuva.org/job/ZeroMQ2-core-master_mingw32-win7/
|
[14:18] zchrish
|
Oh, great. Thanks.
|
[14:19] mikko
|
http://snapshot.valokuva.org/
|
[14:19] mikko
|
you can also get a snapshot from there
|
[14:20] ianbarber
|
mikko: are there any plans to do rpm and deb builds as part of your build empire at any point?
|
[14:21] ianbarber
|
it seems like you have to run the make dist (i guess?) to generate the spec files etc. properly, so it seems build-y
|
[14:21] mikko
|
yes, debian would be easy as the build master is debian
|
[14:21] mikko
|
rpm would require setting up centos/rhel build slave
|
[14:22] ianbarber
|
really? i mean, you could run rpmbuild on debian ok
|
[14:24] mikko
|
the dependencies don't come out correctly
|
[14:24] ianbarber
|
i guess the only difficulty would be rpmbuild complaining about build dependencies on compile, but you should be able to force ignore that, as the build reqs will be there
|
[14:24] mikko
|
thats not a problem
|
[14:24] mikko
|
the problem is run-time deps
|
[14:25] ianbarber
|
verifying them, or an actual problem during the build?
|
[14:26] ianbarber
|
probably easiest to have the actual centos env though i guess
|
[14:28] mikko
|
if you build on debian you depend on the libraries present on the host system
|
[14:28] mikko
|
which are more than likely not present in RHEL/Centos
|
[14:28] mikko
|
also library versions you link against etc
|
[14:29] ianbarber
|
oh, i get it. yeah, that's a pain. could build the SRPMS automatically, but yeah, would be easier to have a sep slave
|
[14:31] sustrik
|
still, the idea of building packages on remote boxes is quite nice
|
[14:32] sustrik
|
anyone who wants to have fresh packages available could volunteer to run the build on his box
|
[14:32] sustrik
|
this way we could get quite a nice coverage of different CPUs/OSes
|
[14:32] kristsk
|
how bout vms ?
|
[14:33] sustrik
|
are you running vms?
|
[14:33] sustrik
|
vms is a problem as it's not an unix
|
[14:33] kristsk
|
uh huh, i meant, VM's not VMS
|
[14:33] sustrik
|
can you simulate a different CPU in a VM?
|
[14:34] kristsk
|
probably not :/
|
[14:36] Guthur
|
talking about builds...I went and broke the clrzmq2 build again...
|
[14:37] Guthur
|
Visual Studios Solutions are really beginning to bug me
|
[14:39] sustrik
|
heh
|
[14:40] mikko
|
Guthur: why dont you use msbuild?
|
[14:40] mikko
|
kristsk: we are running different operating systems in VMs
|
[14:41] Guthur
|
mikko: you know I asked myself the same question this morning, hehe
|
[14:41] mikko
|
kristsk: linux, windows, solaris, freebsd but they are all x86
|
[14:41] kristsk
|
yeah but sparcish things prolly wont run in vm on x86
|
[14:41] Guthur
|
mikko: but I do want to make sure that clrzmq2 can be used easily from MSVS
|
[14:42] mikko
|
not sure if qemu can emulate processors
|
[14:42] Guthur
|
clrzmq2 will experience more resistance if it doesn't work well from visual studio
|
[14:42] Guthur
|
IMO
|
[14:46] sustrik
|
Guthur: +1
|
[14:48] mikko
|
hi neale1
|
[14:49] mikko
|
everything seems to be up and running and the latest rounds of builds succeeded
|
[14:50] neale1
|
Hi Mikko. Yes, we're building on bigger processor than my little emulator box. It's about 3 models older than the current generation. It's on a system with 700 virtual machines so it's being constrained so it "plays well with others". But 7 mins is pretty good turnaround.
|
[14:50] mikko
|
neale1: yeah, definitely. thanks for your hard work
|
[14:51] neale1
|
No worries
|
[14:51] mikko
|
i'll send an email out today that mentions the new system z build machine if you don't mind?
|
[14:51] neale1
|
Sure
|
[14:52] neale1
|
The machine it's running on is part of IBM's OSDL facility running at Marist College in mid-state new York.
|
[14:52] neale1
|
I'm the s390x maintainer of mono so that's where I do my builds for it
|
[15:05] sustrik
|
hm, the people/organizations who participate in build system should be mentioned somewhere so that they get the credit...
|
[15:12] ianbarber
|
with pub/sub, you can have multiple subs connected to a pub, multiple pubs connected to a sub, but not multiple pubs connected to multiple subs right? does that include multicast transports?
|
[15:13] mikko
|
stutter: i need to create a wiki page detailing the build system
|
[15:14] sustrik
|
multicast treats the network switch (hardware) as a device
|
[15:14] mikko
|
and organisations / people who participate in it
|
[15:14] stutter
|
mikko: did you mean to tab-complete someone else/
|
[15:14] mikko
|
yes
|
[15:14] stutter
|
:P
|
[15:14] mikko
|
i meant to tab complete sustrik
|
[15:14] stutter
|
i figured
|
[15:14] mikko
|
sorry about that
|
[15:14] sustrik
|
mikko: ok
|
[15:14] mikko
|
ill try to have time today, ill put a reminder
|
[15:15] sustrik
|
ianbarber: so both pubs and subs "connect" to the switch
|
[15:15] sustrik
|
which then forwards any message from any pub to all subs
|
[15:15] ianbarber
|
that makes sense, thanks
|
[15:16] mikko
|
now when i look at ian's message about openpgm i think steven sent me an email about it
|
[15:16] mikko
|
didnt have time to look further
|
[15:16] sustrik
|
mikko: nice, i think if we made it reasonably visible we can get more participants
|
[15:16] mikko
|
"Building ZeroMQ I finally get through to the same assertion, and wondering through the code I find the window size is being calculated with options.rate in bytes not kilobytes. So add a × 1024 for both rxw_sqns and txw_sqns."
|
[15:16] mikko
|
was the relevant part from his email
|
[15:17] mikko
|
but adding 1024 x didn't solve the issue for me and didn't really have time to look it further
|
[15:17] sustrik
|
i tried to debug it, but here it seemed that setting the port had failed rather than the rate...
|
[15:17] sustrik
|
strange
|
[15:22] ianbarber
|
mikko: there is a comment to that effect, // Data rate is in [B/s]. options.rate is in [kb/s].
|
[15:23] ianbarber
|
and it is possible I was hitting the assert due to failing pgm_setsockopt (sock, IPPROTO_PGM, PGM_TXW_SQNS,
|
[15:23] ianbarber
|
&txw_sqns, sizeof (txw_sqns))
|
[15:28] ianbarber
|
yep, that's it
|
[15:28] ianbarber
|
if txw_sqns is 0, it asserts
|
[15:30] ianbarber
|
pgm_max_tpdu is 1500
|
[15:30] ianbarber
|
the sqns is options.recovery_ivl * options.rate /
|
[15:30] ianbarber
|
txw_max_tpdu
|
[15:31] ianbarber
|
and recovery_ivl defaults to 10
|
[15:31] ianbarber
|
so at 149 or below, it comes out as int 0
|
[15:31] ianbarber
|
which blows up
|
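The rounding-to-zero case described above can be reproduced with plain integer arithmetic. A minimal sketch, assuming the expression is exactly `recovery_ivl * rate / max_tpdu` as quoted; the function name is made up for illustration and Python's `//` stands in for C integer division (same result for positive values).

```python
PGM_MAX_TPDU = 1500  # bytes, as quoted in the discussion


def txw_sqns_as_written(recovery_ivl, rate, max_tpdu=PGM_MAX_TPDU):
    """Mirror the quoted expression using integer division."""
    return recovery_ivl * rate // max_tpdu


# recovery_ivl defaults to 10; the default rate was given as 100 earlier.
default_case = txw_sqns_as_written(10, 100)    # 1000 // 1500
just_below = txw_sqns_as_written(10, 149)      # 1490 // 1500
first_nonzero = txw_sqns_as_written(10, 150)   # 1500 // 1500

print(default_case, just_below, first_nonzero)
```

With a recovery interval of 10, any rate below 150 gives a window of zero sequence numbers, which matches the assertion going away once ZMQ_RATE is raised.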
[15:31] ianbarber
|
so i think that for this calculation options.rate should be multiplied by 1024
|
[15:32] ianbarber
|
or zmq:txw_max_tpdu appropriately reduced, but that seems more likely to have knock ons
|
[15:33] Guthur
|
is there documentation of ZeroMQs underlying architecture, or is it just the source code?
|
[15:38] sustrik
|
ianbarber: that's useful
|
[15:38] sustrik
|
let me see whether it can be fixed
|
[15:38] ianbarber
|
just tested multiplying it by 1024 and it works fine
|
[15:38] ianbarber
|
i can send a patch if useful (it's like 4 lines, so maybe not worth it)
|
[15:39] sustrik
|
you mean patch that multiplies the value?
|
[15:39] ianbarber
|
yeah
|
[15:42] sustrik
|
hm, it seems it should be multiplied by 8 rather than 1024
|
[15:43] sustrik
|
tpdu is in bytes
|
[15:43] sustrik
|
rate is in bits
|
[15:43] ianbarber
|
ah, you're right
|
[15:43] ianbarber
|
just fired the patch on mail as well, so you could see what i meant, but yeah, it should be 8
|
[15:44] ianbarber
|
actually, should it?
|
[15:44] sustrik
|
the calculation is kind of strange
|
[15:44] ianbarber
|
just looking, the comment does mention bytes to kb
|
[15:44] sustrik
|
let us decipher what's going on there...
|
[15:44] ianbarber
|
are you seeing bits to bytes somewhere?
|
[15:44] sustrik
|
nope
|
[15:45] stimpie
|
Guthur, all I could find is the source code.
|
[15:45] sustrik
|
we are speaking about pgm_socket.cpp:201, right?
|
[15:45] ianbarber
|
yeah, or the equivalent bit in sender
|
[15:46] sustrik
|
options.recovery_ivl_msec >= 0 ?
|
[15:46] sustrik
|
options.recovery_ivl_msec * options.rate /
|
[15:46] sustrik
|
(1000 * rxw_max_tpdu) :
|
[15:46] sustrik
|
options.recovery_ivl * options.rate /
|
[15:46] sustrik
|
rxw_max_tpdu
|
[15:46] ianbarber
|
pgm_max_tpdu is 1500, which looks like bytes
|
[15:46] sustrik
|
yes, it's bytes
|
[15:46] sustrik
|
so, first of all, there are 2 options
|
[15:46] sustrik
|
recovery_ivl and recovery_ivl_msec
|
[15:47] sustrik
|
the former is in secs, the latter is in msecs
|
[15:47] sustrik
|
if the latter is set to 0 (default), former value is used
|
[15:47] ianbarber
|
zmq_rate says kilobits per second
|
[15:48] ianbarber
|
the msec/sec is dealt with in the * 1000
|
[15:48] ianbarber
|
so i think we need * 128
|
[15:48] ianbarber
|
to go from kilobits to bytes
|
[15:48] sustrik
|
i haven't got that far yet :)
|
[15:50] ianbarber
|
:)
|
[15:53] sustrik
|
recovery_ivl * rate * max_tpdu * 1000 / 8
|
[15:54] sustrik
|
ianbarber: can you double check that?
|
[15:54] sustrik
|
sorry
|
[15:55] sustrik
|
recovery_ivl * rate / max_tpdu * 1000 / 8
|
[15:56] ianbarber
|
was about to say :)
|
[15:57] sustrik
|
ok, now the ordering of the terms
|
[15:57] sustrik
|
the computation is done in ints
|
[15:57] sustrik
|
so we have to beware rounding down to zero
|
[15:57] ianbarber
|
not sure about that one still
|
[15:57] sustrik
|
what's wrong?
|
[15:57] ianbarber
|
10 * 100 / 1500 * 1000 / 8
|
[15:57] ianbarber
|
that's with the default numbers plugged in
|
[15:59] sustrik
|
that equals in ~83 packets
|
[16:00] ianbarber
|
of course, this is for the ivl_msec case, so it would be 10000 or thereabouts on the left
|
[16:00] ianbarber
|
yeah, that works then
|
[16:01] ianbarber
|
though i don't get your 83 - that looks like about 5 packets to me. (i am likely missing something though)
|
[16:02] sustrik
|
<ianbarber> 10 * 100 / 1500 * 1000 / 8
|
[16:02] sustrik
|
= 83
|
[16:03] ianbarber
|
yep, of course. i shouldn't try and do that in my head :)
|
[16:03] ianbarber
|
yeah, that looks good
|
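The unit conversion agreed on above can be checked with a short sketch. It assumes rate is in kilobits per second with 1 kilobit = 1000 bits (as in the formula quoted above, rather than the 1024-based factor of 128 suggested earlier), recovery_ivl is in seconds and max_tpdu is in bytes; the function names are hypothetical.

```python
def window_packets(recovery_ivl_s, rate_kbit, max_tpdu_bytes=1500):
    """Packets needed to cover the recovery interval, multiplying before dividing."""
    bytes_in_window = recovery_ivl_s * rate_kbit * 1000 // 8  # kilobits -> bytes
    return bytes_in_window // max_tpdu_bytes


def window_packets_naive_order(recovery_ivl_s, rate_kbit, max_tpdu_bytes=1500):
    """Same terms evaluated left to right in integer math, as typed in the chat."""
    return recovery_ivl_s * rate_kbit // max_tpdu_bytes * 1000 // 8


good = window_packets(10, 100)               # 125000 bytes // 1500
naive = window_packets_naive_order(10, 100)  # 1000 // 1500 is already 0

print(good, naive)
```

The first version gives 83 packets for the default values; the second collapses to 0 because the early division truncates, which is why the ordering of the terms matters when the computation is done in ints.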
[16:03] sustrik
|
ok, now the ordering...
|
[16:04] sustrik
|
i think we can multiply all multiplicands without overflowing the int
|
[16:05] sustrik
|
hm
|
[16:05] sustrik
|
with 1Gb rate we can overflow a 32-bit integer
|
[16:05] sustrik
|
maybe the computation should be done in int64_t
|
[16:06] sustrik
|
and then cast to int
|
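A quick check of the overflow concern, as a sketch only. It assumes a 1Gb rate means 1,000,000 kilobits per second and the default recovery interval of 10 seconds. Python integers do not overflow, so the 32-bit behaviour is simulated with a hypothetical `wrap_int32` helper; in C, signed overflow is undefined, so wrapping is only the typical outcome, not a guaranteed one.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(value):
    """Interpret an integer as a two's-complement 32-bit value."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value > INT32_MAX else value


recovery_ivl = 10       # seconds
rate_kbit = 1_000_000   # assumption: 1Gb expressed in kilobits per second
max_tpdu = 1500         # bytes

intermediate = recovery_ivl * rate_kbit * 1000   # bits in the window
overflows = intermediate > INT32_MAX             # does not fit in a 32-bit int
wrapped = wrap_int32(intermediate)               # what a wrapping 32-bit int would hold

# Doing the math in a wide type and narrowing only the final result is safe here.
packets = intermediate // 8 // max_tpdu
fits_after_division = packets <= INT32_MAX

print(overflows, wrapped, packets, fits_after_division)
```

The intermediate product needs the wider type, while the final packet count is small enough to cast back to int, which is the int64_t-then-cast approach suggested above.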
[16:07] sustrik
|
what unit is rxw_sqns meant to be in
|
[16:08] sustrik
|
number of PGM packets?
|
[16:08] ianbarber
|
sequence numbers
|
[16:08] ianbarber
|
yeah, packets basically
|
[16:08] sustrik
|
so the computation is completely wrong now
|
[16:09] ianbarber
|
there is also a limit to how big it is
|
[16:09] sustrik
|
it's 125x less than it should be :|
|
[16:09] ianbarber
|
spec says no greater than half the sequence number space less one
|
[16:09] ianbarber
|
though I suspect the space is pretty huge :)
|
[16:09] sustrik
|
yeah, something like 2billion
|
[16:09] sustrik
|
irrelevant here
|
[16:09] ianbarber
|
yeah
|
[16:10] sustrik
|
what i don't get is how it can work for anybody
|
[16:10] ianbarber
|
well, it works if you have a high enough rate
|
[16:10] sustrik
|
the buffers allocated now are 125x smaller than they should be
|
[16:11] ianbarber
|
you just get a small sqns range, so it'll effectively get a lot less throughput than it should i think
|
[16:11] sustrik
|
well, yes, but that means your recovery interval is 125x shorter
|
[16:11] ianbarber
|
yeah, so less reliable
|
[16:11] sustrik
|
strange
|
[16:11] ianbarber
|
but reliability is pretty good on networks - most people are unlikely to hit it unless they're dropping packets
|
[16:11] sustrik
|
maybe
|
[16:12] sustrik
|
but if we fix it now
|
[16:12] sustrik
|
people's buffers will grow 126x
|
[16:12] sustrik
|
125x
|
[16:12] sustrik
|
and possibly get out of memory
|
[16:12] sustrik
|
:(
|
[16:12] ianbarber
|
hmm
|
[16:12] mikko
|
at the moment it asserts (?)
|
[16:12] sustrik
|
mikko: it looks like the computation of pgm buffer size is wrong
|
[16:12] mikko
|
just came back
|
[16:13] mikko
|
i think that is something that should go to 2.1.x
|
[16:13] sustrik
|
yielding buffers 125x smaller than they should be
|
[16:13] ianbarber
|
not sure, txw_seqns does interact with txw_secs
|
[16:13] ianbarber
|
so perhaps the effect isn't as large as it might be, of increasing the size
|
[16:13] sustrik
|
i think we need to consult steven before changing it
|
[16:14] sustrik
|
he doesn't seem to be online now
|
[16:14] ianbarber
|
yeah, he's the best person to comment
|
[16:14] sustrik
|
ianbarber: can you drop him a note on the mailing list?
|
[16:14] ianbarber
|
sure thing
|
[16:14] sustrik
|
thanks
|
[16:28] Guthur
|
I was afraid it was only the source code as documentation
|
[16:31] sustrik
|
Guthur: what exactly do you need?
|
[16:39] Guthur
|
sustrik: understanding, hehe
|
[16:39] mikko
|
neale1: on the build page should i list "Contributed by: IBM" or something else
|
[16:40] Guthur
|
I was hoping to provide patches, but there is a lot to understand
|
[16:40] sustrik
|
i've started writing something:
|
[16:40] sustrik
|
http://www.zeromq.org/whitepapers:architecture
|
[16:40] Guthur
|
Specifically the duplicate identity issue
|
[16:40] sustrik
|
but never got too far
|
[16:40] sustrik
|
in any case, feel free to ask on irc
|
[16:41] Guthur
|
in general I think it would make the learning curve for new contributors a little less steep
|
[16:41] sustrik
|
definitely
|
[16:41] ianbarber
|
that duplicate identity issue seems tricky
|
[16:42] sustrik
|
the only problem is someone has to write it down and maintain it as the development goes on
|
[16:44] Guthur
|
ianbarber: My main issue with it is the fact it crashes the server side
|
[16:44] Guthur
|
clients crashing a server is not good
|
[16:44] ianbarber
|
yep, i agree
|
[16:44] ianbarber
|
i had the same reaction
|
[16:52] ianbarber
|
hmm
|
[16:52] ianbarber
|
sessions_sync.lock ();
|
[16:52] ianbarber
|
bool registered = sessions.insert (
|
[16:52] ianbarber
|
sessions_t::value_type (name_, session_)).second;
|
[16:52] ianbarber
|
sessions_sync.unlock ();
|
[16:52] ianbarber
|
return registered;
|
[16:52] ianbarber
|
just google, the first ref I found for stl map said insert would return the inserted thing, or an operator
|
[16:52] ianbarber
|
not a bool
|
[16:53] ianbarber
|
so the insert wouldn't report a failure, even if it had encountered a collision?
|
[16:53] ianbarber
|
sure sustrik will correct me in short order :)
|
[17:01] mikko
|
http://www.zeromq.org/docs:builds
|
[17:01] mikko
|
needs information about contributors as well
|
[17:03] ianbarber
|
surely put yourself on there under contributors
|
[17:04] ianbarber
|
others can add themselves I guess would be easiest, it is a wiki
|
[17:25] mikko
|
clrzmq2 fails to build atm
|
[17:30] Guthur
|
mikko: yeah sorry about that
|
[17:30] Guthur
|
I know the issue, but can not fix at the moment, at work
|
[17:31] mikko
|
thats fine
|
[17:31] mikko
|
just making sure that it's known
|
[17:31] neale1
|
mikko: There's a mono environment on the same s390x build server if you want to exploit that. What level of mono do you require? It's 2.7.1 atm, but can be brought up to current git head if required
|
[17:31] Guthur
|
the solution file is the source of the problem
|
[17:32] mikko
|
i got mono 2.6.7
|
[17:32] mikko
|
is there a difference with mono across architectures?
|
[17:32] Guthur
|
I should be able to push a fix in a couple of hours
|
[17:33] neale1
|
mikko: Shouldn't be. I build from same base, just have s390x jit to do bytecode to s390x machine code xlate
|
[17:33] mikko
|
neale1: should i mention that "Linux s390x build slave was contributed by IBM" ?
|
[17:33] neale1
|
No, by Marist College would be better
|
[17:33] mikko
|
neale1: i could add a mono build for clrzmq and clrzmq2
|
[17:34] mikko
|
neale1: sure, will add that
|
[17:34] neale1
|
Yes
|
[17:34] mikko
|
do you want url to anywhere?
|
[17:35] neale1
|
I'll ask the Marist folks
|
[17:35] mikko
|
neale1: http://www.zeromq.org/docs:builds
|
[17:35] mikko
|
you can edit the page to add the link
|
[17:37] neale1
|
mikko: Tks
|
[17:38] ianbarber
|
ah, there's a .second on that sessions code, so it's all good. Stepping through it looks like register session is only being called once, for some odd reason
|
[17:41] neale1
|
mikko: Do I have a wikidot id to update that page?
|
[17:42] mikko
|
neale1: i think you can create one
|
[17:42] mikko
|
neale1: im not sure if you have one already :)
|
[17:44] neale1
|
mikko: Cld u just add this http://www.marist.edu/it/datacenter/systems.html
|
[17:44] mikko
|
neale1: will do
|
[17:45] mikko
|
neale1: done
|
[17:54] sustrik
|
ianbarber: insert returns pair, the code selects the second part of the pair, which is bool
|
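The insert-if-absent behaviour described above (the existing entry is kept and the caller is told whether the insert happened) can be mirrored in Python for illustration. This is only an analog of the quoted C++ code, not the 0MQ implementation; `register_session` here is a hypothetical stand-in that uses a dict in place of the map.

```python
def register_session(sessions, name, session):
    """Insert only if the name is free; report whether the insert happened."""
    if name in sessions:
        return False  # collision: the existing entry is left untouched
    sessions[name] = session
    return True


sessions = {}
first = register_session(sessions, "peer-A", "session-1")
second = register_session(sessions, "peer-A", "session-2")  # duplicate identity

print(first, second, sessions["peer-A"])
```

So a duplicate name is reported to the caller rather than silently overwriting the first session; what the caller does with that result is a separate question.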
[17:55] ianbarber
|
sustrik: spotted that, seems correct
|
[17:55] ianbarber
|
weird thing I had when debugging the dual identities thing is that it only seems to be hit once
|
[17:55] ianbarber
|
so I suspect the problem is further up the chain, in attach or something like that
|
[17:55] sustrik
|
ianbarber: yes, possibly
|
[17:57] sustrik
|
where exactly does it crash now?
|
[17:57] ianbarber
|
Assertion failed: new_sndbuf > old_sndbuf (mailbox.cpp:182)
|
[17:58] ianbarber
|
i think that's just because if it gets a send fail it just tries to increase buffer size (until it can't)
|
[17:58] sustrik
|
ianbarber: that has nothing to do with duplicate identities imo
|
[17:59] sustrik
|
well, unless duplicate identities cause an infinite or extremely fast generation of commands
|
[17:59] sustrik
|
hm, the latter may be the case...
|
[18:01] ianbarber
|
sustrik: i don't get the crash on connect, only when I attempt to send something to connected peers
|
[18:01] ianbarber
|
so it may be something of that nature
|
[18:02] rook-
|
hello? I have a noob question if someone has a few minutes
|
[18:02] sustrik
|
ianbarber: interesting
|
[18:02] sustrik
|
rook-: shoot
|
[18:03] sustrik
|
ianbarber: it looks like large amount of commands is generated for some reason
|
[18:03] rook-
|
I'm looking for a messaging system to support a distributed "eventually consistent" cloud architecture...
|
[18:03] sustrik
|
you need to find out why it is so
|
[18:03] sustrik
|
check the send_command() function to hook into command passing mechanism
|
[18:03] rook-
|
all the mq's I've found seem to support 2 models: durable queues and pub/sub (fanout)
|
[18:04] rook-
|
but nothing seems to support publishing blindly to n-subscribers and guaranteeing that each subscriber receives 1 and only 1 copy of the message
|
[18:04] sustrik
|
it's impossible
|
[18:05] sustrik
|
slow consumers are the problem
|
[18:05] rook-
|
by blindly, I mean that the publisher just writes 1 message and all subs who've scribed (whether they're connected or not) eventually get it
|
[18:05] rook-
|
it looks to me like a pub-sub with a replay service or something similar would solve the problem, but haven't seen anything like it
|
[18:06] sustrik
|
you would need infinite memory to hold the messages for slow/stuck consumers
|
[18:06] rook-
|
infinite/disk
|
[18:06] rook-
|
yea
|
[18:07] sustrik
|
yes, so it can't be done
|
[18:07] rook-
|
um.. I have a hard time believing that...
|
[18:08] rook-
|
since so many services are using eventually consistent frameworks of their own devising
|
[18:08] sustrik
|
shrug
|
[18:09] rook-
|
I mean, there may be the need for a high water mark/"beyond this point the stuck consumer needs help with rebuilding"
|
[18:09] rook-
|
but "can't be done" seems improbable
|
[18:10] sustrik
|
there's an alternative, yes
|
[18:10] sustrik
|
you can block the sender
|
[18:10] sustrik
|
but that means that the slow/stuck consumer can stop all the data distribution
|
[18:10] sustrik
|
even to the properly working consumers
|
[18:11] rook-
|
yea that'd suck :D
|
[18:11] sustrik
|
so it's either-or
|
[18:11] sustrik
|
either drop messages when there's no space to store them
|
[18:11] sustrik
|
or block the whole distribution
|
[18:12] rook-
|
hrm...
|
[18:13] rook-
|
I think in our env. a rolling window would potentially be ok... i.e. if the consumer isn't around for a given period of time, then remediation is required for it to rejoin and resync
|
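The rolling-window idea above might look roughly like this. It is a sketch under assumptions: the window is bounded by message count rather than by time, and `RollingWindow`, `ResyncRequired` and `catch_up` are hypothetical names, not 0MQ APIs.

```python
from collections import deque


class ResyncRequired(Exception):
    """Raised when a subscriber asks for messages that have left the window."""


class RollingWindow:
    def __init__(self, capacity):
        # A bounded deque drops the oldest entry when a new one arrives.
        self.buffer = deque(maxlen=capacity)
        self.next_seq = 0

    def publish(self, body):
        self.buffer.append((self.next_seq, body))
        self.next_seq += 1

    def catch_up(self, last_seen):
        """Return messages after last_seen, or raise if some were already dropped."""
        wanted = last_seen + 1
        oldest = self.buffer[0][0] if self.buffer else self.next_seq
        if wanted < oldest:
            raise ResyncRequired(f"need seq {wanted}, oldest kept is {oldest}")
        return [(seq, body) for seq, body in self.buffer if seq >= wanted]


window = RollingWindow(capacity=3)
for i in range(5):
    window.publish(f"m{i}")  # only seq 2, 3, 4 remain afterwards

recent = window.catch_up(last_seen=2)
try:
    window.catch_up(last_seen=0)  # seq 1 has already been dropped
    needs_resync = False
except ResyncRequired:
    needs_resync = True
```

The publisher never blocks and never needs unbounded storage; the cost is that a consumer that stays away too long gets an explicit signal to rebuild its state by some other means.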
[18:13] sustrik
|
ack
|
[18:13] rook-
|
true
|
[18:13] sustrik
|
that's the standard model for pub-sub
|
[18:14] rook-
|
sorry?
|
[18:14] rook-
|
I thought that pub/sub was limited by immediate connectivity and deliverability
|
[18:14] rook-
|
i.e. if consumer isn't there to receive message, or isn't accepting the message, it may miss out
|
[18:16] rook-
|
is my impression of pub/sub overly simplified?
|
[18:16] rook-
|
/ have I missed something?
|
[18:18] ianbarber
|
rook: that's how pubsub works in 0MQ, without identities. If you have identities set then it'll try and buffer up messages for them.
|
[18:19] sustrik
|
rook-: yes, you are right
|
[18:19] sustrik
|
that's the general model
|
[18:19] sustrik
|
however, different solutions add some amount of reliability
|
[18:20] sustrik
|
for example, PGM retransmits the missing packets if they are still in the reliability window
|
[18:20] sustrik
|
etc.
|
[18:21] Guthur
|
is the HWM per durable subscriber or does it affect all globally?
|
[18:21] sustrik
|
it's per peer
|
[18:22] Guthur
|
is there no way to clear it, besides the peer connecting?
|
[18:23] rook-
|
so I can, in 0mq, have a publisher, who just publishes events, and I can have n-consumers that each, with some reliability, receive every message and the publisher doesn't need to know or be configured with what the consumers are, how many of them there are, etc.
|
[18:23] rook-
|
?
|
[18:27] rook-
|
?
|
[18:28] rook-
|
sustrik?
|
[18:31] sustrik
|
Guthur: no
|
[18:31] sustrik
|
rook-: yes
|
[18:32] rook-
|
cool!
|
[18:32] rook-
|
is the size of the window configurable? I clearly have more reading to do, url reference handy? Pointers?
|
[18:32] rook-
|
(thanks for your help btw)
|
[18:33] Guthur
|
rook-: http://zguide.zeromq.org/chapter:all
|
[18:33] Guthur
|
you may have seen it already
|
[18:33] sustrik
|
or check the man pages
|
[18:34] rook-
|
yea saw the docs - to be honest, I find them... difficult.. very friendly and jovial but not as... concise as typical docs ;)
|
[18:34] sustrik
|
there's a reference
|
[18:34] sustrik
|
that's concise
|
[18:35] rook-
|
ok I'll poke about
|
[18:35] rook-
|
now that I know that 0mq handles it
|
[18:35] rook-
|
I feel more justified in digging harder for the info :)
|
[18:35] rook-
|
thanks for the help - off to read
|
[18:35] rook-
|
:)
|
[19:45] mikko
|
good evening
|
[19:58] neale1
|
mikko: I am getting an article on 0MQ printed in the April/May edition of z/Journal (a journal for IBM mainframers). It's an introduction to 0MQ for a group more familiar with things like MQSeries
|
[19:59] mikko
|
nice
|
[19:59] mikko
|
i read about ibm system z
|
[19:59] mikko
|
seems like a pretty serious computing system
|
[20:00] neale1
|
Yep, there's a major conference in Anaheim later this month which I'm going to. I wasn't able to schedule a 0MQ talk this time but I'll be trying to get one on the agenda for August in Orlando. As I'm on the program committee, it should get on
|
[20:02] mikko
|
what is the conference in Orlando?
|
[20:02] neale1
|
Same as the one in Anaheim - they hold 2 yearly. It's called "SHARE"
|
[20:02] mikko
|
is it mainframy conference?
|
[20:03] neale1
|
Yep
|
[20:03] neale1
|
http://share.confex.com/share/116/webprogram/start.html
|
[20:05] mikko
|
seems like a big conference
|
[20:06] mikko
|
i've worked with some of the technologies mentioned in the talks
|
[20:06] mikko
|
mainly on integrating them with web
|
[20:07] sustrik
|
the introduction to 0mq for mq series users is something that's badly missing
|
[20:08] sustrik
|
neale1: would your article be publicly available?
|
[20:08] neale1
|
sustrik: Yes, z/Journal has it online as well as in a physical magazine.
|
[20:08] sustrik
|
great!
|
[20:09] neale1
|
sustrik: I'm not doing an "if you're using WebSphere MQ and you do x, then on 0MQ you do y" piece
|
[20:09] sustrik
|
understood
|
[20:09] neale1
|
It's a straight introduction trying to put it in context of where it would be useful.
|
[20:09] sustrik
|
i mean, the problem is that if you have "corporate messaging" experience
|
[20:09] neale1
|
A lot of folks want to avoid the complexity and cost of MQ
|
[20:09] sustrik
|
you have quite a lot of expectations...
|
[20:09] neale1
|
yes
|
[20:10] sustrik
|
that don't apply to 0mq
|
[20:10] sustrik
|
so i think the point of the article is to explain the difference between mq series and 0mq
|
[20:10] sustrik
|
like "what you should expect from 0mq"
|
[20:10] sustrik
|
right?
|
[20:12] sustrik
|
in any case, that kind of introduction is missing so far
|
[20:14] neale1
|
At the moment it's even more basic than that. I want to aim at the sockets programmers who are dreading having to go to a full brokered option.
|
[20:14] sustrik
|
ah, that's easier, yes
|
[20:15] neale1
|
The linux on system z community is about 10 years old now, but most of its growth has been in the past 5. We're seeing a mix of traditional sysadmin/programmer types from the UNIX world combining with the old hands from the traditional mainframe set.
|
[20:16] neale1
|
I want to conclude the article with a set of questions people can ask when they want to select a technology. It might be useful to open the floor here to suggestions.
|
[20:16] sustrik
|
definitely
|
[20:17] sustrik
|
it's a rather hard question to ask though
|
[20:17] sustrik
|
as it's not only a technical question
|
[20:18] sustrik
|
and depends on whether there are people skilled in the technology, whether the company has licenses bought already etc.
|
[20:18] neale1
|
In my own case I discovered 0MQ when I was tasked with a project that needed to have several "collector" clients receive raw data, package it, and send it to a sink which would process the collected data. In addition I wanted to add collectors at will, ensure delivery, receive configuration information from the server. And do this in a minimal footprint with little effect on overall CPU on the systems where those collectors are ru
|
[20:18] neale1
|
Oh, and not require the customer install a full-fledged broker type system
|
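The collector-to-sink shape described above can be sketched with nothing but the Python standard library. This is not 0MQ code: a `queue.Queue` stands in for the connection to the sink, threads stand in for the collector clients, and `collector`, `run_sink`, and the `None` end marker are all assumptions made for illustration. It shows only the fan-in structure, where collectors can be added without the sink being configured for each one.

```python
import queue
import threading


def collector(name, raw_items, sink_queue):
    """Package each raw item and hand it to the sink."""
    for item in raw_items:
        sink_queue.put({"collector": name, "payload": item})
    sink_queue.put(None)   # end marker: this collector has finished


def run_sink(sink_queue, n_collectors):
    """Receive packaged items until every collector has sent its end marker."""
    received = []
    finished = 0
    while finished < n_collectors:
        msg = sink_queue.get()
        if msg is None:
            finished += 1
        else:
            received.append(msg)
    return received


inbox = queue.Queue()
work = {"c1": [1, 2], "c2": [3], "c3": [4, 5, 6]}   # made-up raw data per collector
threads = [
    threading.Thread(target=collector, args=(name, items, inbox))
    for name, items in work.items()
]
for t in threads:
    t.start()
results = run_sink(inbox, len(threads))
for t in threads:
    t.join()

payloads = sorted(m["payload"] for m in results)
print(payloads)
```

Adding a fourth collector here means adding one entry to `work`; the sink code does not change.
|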
[20:19] sustrik
|
right, then using 0mq makes perfect sense
|
[20:19] sustrik
|
when you spoke about mainframes i thought of integrating legacy apps
|
[20:19] sustrik
|
rather than writing new apps
|
[20:20] sustrik
|
but linux subsystem is probably not that legacy-oriented
|
[20:23] neale1
|
Actually, IBM has spent a lot of time/money putting hooks into its "legacy" systems to allow them to integrate with Linux. One thing we'd need to do on the legacy side though is to port the code (or provide a minimal set of things like the tcp: transport)
|
[20:23] sustrik
|
right
|
[20:24] sustrik
|
anyway, if you are going to have a lecture about 0mq
|
[20:24] neale1
|
Within a mainframe you can talk between the legacy systems like z/OS and a z/Linux system at memory-to-memory speeds even though it appears to be a layer2/layer3 network
|
[20:24] sustrik
|
announce it on www.zeromq.org/community (see top right)
|
[20:24] neale1
|
Will do
|
[20:24] sustrik
|
neale1: are you familiar with z/OS as well?
|
[20:26] neale1
|
sustrik: Yes as well as z/VSE and z/VM. The latter two I've been using for *many* years now. I have only a few years experience with z/OS.
|
[20:26] neale1
|
0MQ could easily be ported to what z/OS calls Unix System Services (USS). It has a full UNIX-branded set of APIs.
|
[20:27] sustrik
|
wow
|
[20:27] sustrik
|
they have a posix-y interface, right?
|
[20:27] neale1
|
Yes, fully posix-compliant
|
[20:27] sustrik
|
that's need
|
[20:27] sustrik
|
neat
|
[20:28] sustrik
|
would be nice to have the port one day
|
[20:28] sustrik
|
but i assume the build system would be the main problem
|
[20:28] neale1
|
They have most of the common tools too: autoconf/automake/libtool - at what level I'm not sure
|
[20:28] sustrik
|
even better
|
[20:29] sustrik
|
there's a 0mq port to VMS
|
[20:29] sustrik
|
there's a POSIX interface available
|
[20:29] neale1
|
Yes, I saw that.
|
[20:29] sustrik
|
but the problem is that there's no unix shell
|
[20:29] sustrik
|
so the autotools don't work
|
[20:29] neale1
|
Ugh
|
[20:30] neale1
|
On USS they have a korn shell but you can also run bash
|
[20:30] sustrik
|
sounds good
|
[20:30] Guthur
|
is there a zeromq irc log?
|
[20:30] sustrik
|
travlr used to log the conversation
|
[20:30] neale1
|
Same goes with z/VM
|
[20:30] sustrik
|
i see
|
[20:31] Guthur
|
ah ha, and he still does
|
[20:31] Guthur
|
cheers sustrik
|
[20:31] sustrik
|
neale1: maybe give it a shot if you have some spare time
|
[20:31] Guthur
|
oh it's a bit behind though
|
[20:32] neale1
|
I'm going to be installing an upgrade of our z/OS system in the near future, so I'll make sure the USS environment and tools are there.
|
[20:32] sustrik
|
the logger used to hang around called 'zmqircd'
|
[20:32] sustrik
|
but it doesn't seem to be there for some time already
|
[20:33] sustrik
|
neale1: nice
|
[20:33] mikko
|
i got logs
|
[20:33] mikko
|
since last Feb i think
|
[20:34] Guthur
|
it only goes up until Feb 4 2011 or something
|
[20:34] Guthur
|
I suppose it just needs the newest ones placed on it
|
[20:36] mikko
|
it should be up to date
|
[20:37] Guthur
|
here http://travlr.github.com/zmqirclog/
|
[20:37] Guthur
|
correct?
|
[20:37] mikko
|
http://valokuva.org/~mikko/zeromq.log
|
[20:37] mikko
|
mine's here
|
[20:37] mikko
|
it just links to my normal irc log from this channel
|
[21:23] Guthur
|
ok, hopefully that's the clrzmq2 build fixed
|
[22:28] Guthur
|
just whipped together this asyncReturn device http://paste.lisp.org/display/119452
|
[22:29] Guthur
|
the return call could be made with a push socket supplying the appropriate address
|
[22:30] Guthur
|
As an aside I have just moved the RunningLoop into the base class
|
[22:31] Guthur
|
Just a simple thing, but thought I would share anyway
|