[Time] Name | Message |
[03:00] cremes
|
rev: inproc works great on windows; i think you probably mean ipc
|
[03:00] cremes
|
that isn't supported on windows yet
|
[03:03] cremes
|
pauldix: you need to reread the section on message envelopes: http://zguide.zeromq.org/page:all#Request-Reply-Envelopes
|
[03:04] cremes
|
in order for your example to work, you need to pass the "null" message part that separates the routing envelope from
|
[03:04] cremes
|
the message body
|
[03:04] cremes
|
i'll put a small code example in a comment on your gist
|
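A minimal sketch of the envelope rule cremes describes, assuming a multipart message is modelled as a plain list of byte strings. `wrap` and `split_envelope` are hypothetical helpers, not part of any 0MQ binding, and the identity `b"client-1"` is made up for illustration.

```python
def wrap(address_frames, body_frames):
    """Build a multipart message: routing envelope, empty delimiter, body."""
    return list(address_frames) + [b""] + list(body_frames)


def split_envelope(frames):
    """Split a multipart message at the first empty ("null") frame."""
    if b"" not in frames:
        raise ValueError("no empty delimiter frame between envelope and body")
    i = frames.index(b"")
    return frames[:i], frames[i + 1:]


# Hypothetical peer identity and payload.
message = wrap([b"client-1"], [b"hello"])
envelope, body = split_envelope(message)
print(envelope, body)
```

Without the empty frame there is nothing to tell the receiving side where the routing envelope stops and the body starts, which is the failure mode discussed above.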
[05:35] sustrik
|
mikko: pong
|
[05:35] sustrik
|
pieter_hintjens: pong
|
[06:38] pieter_hintjens
|
sustrik: :-) good morning, I already backported the 228 fix to 2.1/2.2
|
[06:39] sustrik
|
great, thanks
|
[09:41] mikko
|
sustrik: was wondering 3.x
|
[09:41] mikko
|
is the protocol from 3.x future compatible?
|
[09:42] mikko
|
2.x -> 3.x makes a very painful migration due to protocol not being supported
|
[09:49] sustrik
|
mikko: hi
|
[09:49] sustrik
|
hard to say
|
[09:49] sustrik
|
the protocols can change, eg. if acks are introduced to PUSH/PULL
|
[09:50] sustrik
|
these have to be passed on the wire
|
[09:50] mikko
|
2.x -> 3.x jump seems pretty annoying from user's perspective
|
[09:50] sustrik
|
it is
|
[09:50] mikko
|
needs downtime in many scenarios
|
[09:51] sustrik
|
no idea what else to do though
|
[09:51] sustrik
|
any ideas?
|
[09:52] mikko
|
is it possible to include one byte for protocol version?
|
[09:52] mikko
|
and handle that accordingly
|
[09:52] sustrik
|
how would that work?
|
[09:52] mikko
|
maybe a bit late now when 3.x is out
|
[09:53] pieter_hintjens
|
sustrik: I've been thinking about this
|
[09:53] sustrik
|
refusing the connections with incompatible version numbers?
|
[09:53] pieter_hintjens
|
we already discussed how to do it
|
[09:53] mikko
|
it would probably need a bit more design in handling protocol in backwards compatible way
|
[09:53] eintr
|
2.x and 3.x are frame / wire compatible though, right?
|
[09:53] pieter_hintjens
|
mikko: it's not too late, this can be added at any point
|
[09:53] sustrik
|
eintr: nope
|
[09:53] pieter_hintjens
|
eintr: nope
|
[09:53] sustrik
|
:)
|
[09:53] eintr
|
okie then, ignore me (haven't had time to actually check out 3.x yet) :)
|
[09:53] pieter_hintjens
|
mikko: the difficulty with designing the 3.0 protocol is it's not documented
|
[09:54] mikko
|
we are quickly going to alienate users if every major version requires shutting down operation
|
[09:54] pieter_hintjens
|
mikko: obviously
|
[09:54] sustrik
|
so, what's the proposal?
|
[09:54] pieter_hintjens
|
well, first off, document the proposed changes to 3.0 already
|
[09:54] pieter_hintjens
|
so we know what we're actually talking about
|
[09:55] pieter_hintjens
|
second, add version detection to the protocol
|
[09:55] pieter_hintjens
|
so that a 3.0 stack can detect a 2.1 peer safely
|
[09:55] pieter_hintjens
|
and then that a 2.1 peer can safely detect / disconnect a 3.0 peer
|
[09:55] sustrik
|
how does that solve the problem?
|
[09:56] pieter_hintjens
|
those are necessary first steps
|
[09:56] sustrik
|
ok, what next?
|
[09:56] pieter_hintjens
|
then, in 3.0 we can look at supporting the old protocol syntax
|
[09:56] pieter_hintjens
|
since the semantics haven't changed
|
[09:56] sustrik
|
ugh
|
[09:56] pieter_hintjens
|
ugh, yes
|
[09:56] pieter_hintjens
|
or, we can make bridges
|
[09:56] pieter_hintjens
|
but neither of these are even possible today
|
[09:57] pieter_hintjens
|
we can't even safely detect version incompatibility
|
[09:57] sustrik
|
that can be done, but the main question is: who's going to maintain 2.0 protocol in 3.0
|
[09:58] pieter_hintjens
|
it shouldn't require *maintenance*
|
[09:58] sustrik
|
?
|
[09:58] pieter_hintjens
|
if you start on the basis of "I don't want to alienate all my users" then you take a sensible path
|
[09:59] pieter_hintjens
|
you change the protocol format but not the semantics
|
[09:59] pieter_hintjens
|
that means it's simple to support both old and new
|
[09:59] sustrik
|
well, the semantics do change
|
[09:59] pieter_hintjens
|
then when you have migrated the bulk of people to the new protocol version
|
[09:59] pieter_hintjens
|
then you start to add new semantics
|
[09:59] sustrik
|
what else would be the point of changing the protocol
|
[10:00] pieter_hintjens
|
e.g. label flag, and different ID formats are not semantic changes
|
[10:00] pieter_hintjens
|
adding version header is not a semantic change
|
[10:00] sustrik
|
ok, true
|
[10:00] pieter_hintjens
|
req-rep reliability is a semantic change but layered on top in any case
|
[10:00] pieter_hintjens
|
so it's doable, if you actually decide it's worth doing
|
[10:01] sustrik
|
i have no resources to do that
|
[10:01] sustrik
|
same problem as with explicit identities
|
[10:01] sustrik
|
the work of doing so exceeds my bandwidth
|
[10:01] pieter_hintjens
|
you have resources to run but not walk?
|
[10:01] pieter_hintjens
|
e.g. spending 10 minutes to properly document requirements for 3.0 protocol
|
[10:01] pieter_hintjens
|
so that other people can help design a better WLP
|
[10:02] sustrik
|
combining old and new functionality means supporting a lot of complex infrastructure
|
[10:02] pieter_hintjens
|
lack of versioning is a major weakness, due simply to "I don't care about this"
|
[10:02] pieter_hintjens
|
combining old and new means taking more time to move, that's all
|
[10:02] pieter_hintjens
|
like I said, you do a syntax change in one step, semantic changes in later steps
|
[10:02] pieter_hintjens
|
not at the same time
|
[10:03] sustrik
|
how does that solve the problem?
|
[10:04] pieter_hintjens
|
it lets people upgrade safely to 3.0 and then onwards
|
[10:04] pieter_hintjens
|
if you make a barrier to upgrading people will stay on 2.1
|
[10:04] pieter_hintjens
|
being unable to mix old and new is a major barrier
|
[10:04] pieter_hintjens
|
it means larger architectures have to do a big bang upgrade
|
[10:04] pieter_hintjens
|
you're forcing people to test the new functionality all in one go
|
[10:05] sustrik
|
still better than asking them to test new incompatible version several times
|
[10:05] pieter_hintjens
|
?
|
[10:05] pieter_hintjens
|
what you want is to deliver new functionality (e.g API, subscription forwarding, req/rep reliability) on the OLD protocol so people can test it safely
|
[10:06] sustrik
|
you can't
|
[10:06] sustrik
|
new functionality needs new protocol
|
[10:06] sustrik
|
like subscriptions on the wire
|
[10:06] pieter_hintjens
|
nope, it doesn't
|
[10:06] pieter_hintjens
|
those are socket pattern semantics
|
[10:06] pieter_hintjens
|
old code will work unchanged
|
[10:06] sustrik
|
it won't
|
[10:06] sustrik
|
subscriptions will accumulate in the upstream node
|
[10:06] pieter_hintjens
|
xsub/xpub affects my push/pull sockets?
|
[10:07] sustrik
|
and finally it will run out of memory or something
|
[10:07] pieter_hintjens
|
well, this is also a design choice
|
[10:07] sustrik
|
new pub/sub doesn't work with old pub/sub
|
[10:07] pieter_hintjens
|
you _could_ make it compatible
|
[10:07] sustrik
|
to do that i would have to change 2.0 to speak 3.0 protocol
|
[10:07] pieter_hintjens
|
e.g. totally separate socket types
|
[10:07] sustrik
|
ZMQ_PUB1 vs. ZMQ_PUB2
|
[10:07] pieter_hintjens
|
sure
|
[10:07] pieter_hintjens
|
why not?
|
[10:08] sustrik
|
who's going to maintain it?
|
[10:08] rimas
|
hi guys, could anyone help me out with actionscript TCP wrapper for zmq?
|
[10:08] rimas
|
basically I'm having trouble sending data to zmq
|
[10:08] pieter_hintjens
|
rimas: you might have to track down the author...
|
[10:08] pieter_hintjens
|
but I know it works, have seen it in action
|
[10:09] rimas
|
well, recv does work fine, however send doesn't :(
|
[10:09] pieter_hintjens
|
sustrik: it comes down to making smaller changes, getting them live faster, and not making major breaks
|
[10:09] pieter_hintjens
|
and not making multiple major breaks at once
|
[10:10] rimas
|
maybe you guys can explain a bit better on how the whole handshake works with zmq?
|
[10:10] pieter_hintjens
|
rimas: what kinds of sockets are you using?
|
[10:10] sustrik
|
i have no problem with creating new patterns and handing them out to people who are willing to maintain them
|
[10:10] pieter_hintjens
|
sustrik: "maintenance" is a kind of fake problem afaics
|
[10:10] sustrik
|
so, if anyone wants ZMQ_OLD_PUB & ZMQ_OLD_SUB, no problem
|
[10:10] rimas
|
well, i've got python zmq.REP
|
[10:10] pieter_hintjens
|
code that is used, gets maintained
|
[10:10] rimas
|
as a server
|
[10:11] sustrik
|
rimas: there's no handshake, each party just sends one empty message after connecting
|
[10:12] pieter_hintjens
|
rimas: I don't think anyone here knows the AS binding. I'd suggest making a test case, post to the email list, perhaps contact the author
|
[10:13] rimas
|
ok, thanks guys
|
[10:13] pieter_hintjens
|
sustrik: essentially, with 3.0 as it exists today, existing 0MQ users cannot easily upgrade, only new ones can use it
|
[10:13] sustrik
|
yes
|
[10:13] pieter_hintjens
|
however, until people are using 3.0 it will not become stable
|
[10:13] pieter_hintjens
|
and until it is stable, new users will not take the risk of using it
|
[10:13] sustrik
|
sure
|
[10:14] sustrik
|
what does that have to do with the fact i can't handle all the maintenance
|
[10:14] pieter_hintjens
|
conclusion: unless you solve the problem of upgrading, none of the new functionality in 3.0 will ever get widely used
|
[10:14] pieter_hintjens
|
and if it's not used, no-one will maintain it
|
[10:14] pieter_hintjens
|
are you doing a lot of maintenance on 2.1 now?
|
[10:14] sustrik
|
nope
|
[10:15] pieter_hintjens
|
well, why do you keep raising this as an issue then?
|
[10:15] sustrik
|
i don't care about 2-1
|
[10:15] pieter_hintjens
|
I think I've proven that maintenance is a solved problem
|
[10:15] sustrik
|
actually, i don't care about 3-0, you've agreed to maintain it
|
[10:15] pieter_hintjens
|
indeed
|
[10:16] pieter_hintjens
|
but that doesn't mean it'll get used
|
[10:17] sustrik
|
i guess the people who want to use 2.1 protocols and 3.0 functionality should produce the patches
|
[10:17] pieter_hintjens
|
if you want people to use the code you're writing, you can't just push it and then say "I'm not maintaining it"
|
[10:17] pieter_hintjens
|
that won't work
|
[10:17] pieter_hintjens
|
you have made it pretty impossible for anyone to contribute to the protocol
|
[10:17] pieter_hintjens
|
so no-one will propose changes to that
|
[10:17] pieter_hintjens
|
I don't want to be negative
|
[10:18] pieter_hintjens
|
3.0 is pretty cool, I like the simplifications
|
[10:18] sustrik
|
i personally believe that 3.0 should have been dropped in the beginning
|
[10:18] sustrik
|
i've proposed to do so at the time
|
[10:18] sustrik
|
and everyone went like "don't drop it"
|
[10:18] sustrik
|
so here it is
|
[10:18] pieter_hintjens
|
what was the alternative?
|
[10:19] sustrik
|
stabilise 2.x
|
[10:19] pieter_hintjens
|
no future development of 0MQ?
|
[10:19] sustrik
|
exactly
|
[10:19] pieter_hintjens
|
2.x is stable
|
[10:19] pieter_hintjens
|
was stable already
|
[10:19] sustrik
|
support existing users
|
[10:19] pieter_hintjens
|
all that's happening
|
[10:19] sustrik
|
why do we need 3.0 at all?
|
[10:19] pieter_hintjens
|
you enjoyed making it, I assume
|
[10:19] sustrik
|
i can do that work as a private experiment
|
[10:19] pieter_hintjens
|
otherwise you'd have done stuff like make 0MQ robust for Internet use
|
[10:20] sustrik
|
yes
|
[10:20] pieter_hintjens
|
we still have 0MQ/2.2 as an option for that
|
[10:20] pieter_hintjens
|
we can make 2.2 interoperate with 3.0
|
[10:20] sustrik
|
right
|
[10:20] pieter_hintjens
|
maybe 3.0 is simply too far ahead
|
[10:20] sustrik
|
i don't see much point in imposing new functionality on users that don't want it
|
[10:20] sustrik
|
yes, something like that
|
[10:21] pieter_hintjens
|
well, people do want the subscription forwarding
|
[10:21] pieter_hintjens
|
and you need running code for the SP framework
|
[10:21] pieter_hintjens
|
the reasons for making 3.0 were / are sound afaics
|
[10:22] sustrik
|
but it doesn't play well with stability of the product
|
[10:22] sustrik
|
it's stable vs. experimental trade-off
|
[10:22] pieter_hintjens
|
this is normal, and fine
|
[10:22] pieter_hintjens
|
we have our product versions and cycles well defined and they work
|
[10:23] pieter_hintjens
|
the only flaw here is the difficulty of mixing 3.0 code with 2.x code in one architecture
|
[10:23] pieter_hintjens
|
that's all
|
[10:23] pieter_hintjens
|
I can't test any of the new stuff without taking it all
|
[10:23] pieter_hintjens
|
that's too costly and risky
|
[10:23] sustrik
|
yes
|
[10:23] pieter_hintjens
|
I'd say, either we get 3.0 to speak nice to 2.1, or we do this in 2.2
|
[10:23] pieter_hintjens
|
one or the other
|
[10:23] pieter_hintjens
|
that ensures 3.x will get used
|
[10:24] sustrik
|
i would vote for 2.2
|
[10:24] pieter_hintjens
|
that makes sense
|
[10:24] pieter_hintjens
|
it means the 3.x codebase doesn't get messier
|
[10:24] pieter_hintjens
|
and "speaking with" can be strictly constrained
|
[10:25] sustrik
|
yes
|
[10:25] pieter_hintjens
|
2.2 is in good shape, it has all the 2.1 patches plus a couple of other changes
|
[10:25] pieter_hintjens
|
let's think about this for a while, there's no real hurry
|
[10:26] sustrik
|
ok
|
[10:26] pieter_hintjens
|
if you could send me some WLP-patches for 3.0 we can think about how to add version detection to the protocol in a safe way
|
[10:27] sustrik
|
i can send you the LABEL patch
|
[10:28] sustrik
|
the sub forwarding patch is actually a big bunch of patches that touches almost every file in the codebase
|
[10:28] pieter_hintjens
|
i'm only concerned with ZMTP at this point
|
[10:28] pieter_hintjens
|
we should also expand that to document the patterns
|
[10:28] sustrik
|
ok, let me check
|
[10:29] pieter_hintjens
|
mikko: does the idea of using 2.2 as a bridge version make sense?
|
[10:31] sustrik
|
LABEL patch: ab99975ad44ed0fe9ab6
|
[10:32] pieter_hintjens
|
sustrik: at http://rfc.zeromq.org/spec:13#toc16
|
[10:33] pieter_hintjens
|
what changes does the label function make to that?
|
[10:34] sustrik
|
drop more & final
|
[10:34] sustrik
|
introduce 'flags' instead
|
[10:34] sustrik
|
flags = 0x00
|
[10:34] sustrik
|
flags = 0x01
|
[10:34] sustrik
|
flags = 0x80
|
[10:34] sustrik
|
flags = 0x81
|
[10:35] sustrik
|
the lowest bit being LABEL flag
|
[10:35] pieter_hintjens
|
we still use flags for more/final, right?
|
[10:35] sustrik
|
the highest bit being MORE flag
|
[10:35] sustrik
|
yes
|
[10:35] sustrik
|
however, it is split
|
[10:35] pieter_hintjens
|
more-label, final-label, more-data, final-data?
|
[10:36] sustrik
|
other way round
|
[10:36] sustrik
|
0x80 = more
|
[10:36] sustrik
|
0x01 = label
|
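A small sketch of the flags octet as sustrik lays it out here, with the highest bit as MORE and the lowest bit as LABEL. The function name `describe_flags` is hypothetical. The conversation leaves open whether 0x81 is legal on the wire, so this only decodes the bits and does not validate them.

```python
MORE = 0x80   # highest bit, per the discussion
LABEL = 0x01  # lowest bit, per the discussion


def describe_flags(flags):
    """Return (is_more, is_label) for a flags octet."""
    return bool(flags & MORE), bool(flags & LABEL)


for value in (0x00, 0x01, 0x80, 0x81):
    more, label = describe_flags(value)
    print(f"0x{value:02x}: more={more} label={label}")
```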
[10:36] pieter_hintjens
|
yeah, I mean semantically
|
[10:36] sustrik
|
yes
|
[10:36] pieter_hintjens
|
in terms of what a message is
|
[10:37] sustrik
|
however, the two are mutually exclusive
|
[10:37] pieter_hintjens
|
0..n label frames, 0..n data frames
|
[10:37] sustrik
|
so more-label doesn't happen
|
[10:37] sustrik
|
exactly
|
[10:37] pieter_hintjens
|
so label always implies more?
|
[10:37] sustrik
|
yes
|
[10:37] pieter_hintjens
|
can you have a message consisting only of label frames?
|
[10:37] sustrik
|
maybe it's wrong
|
[10:38] pieter_hintjens
|
this is why you want protocols discussed independently of code...
|
[10:38] pieter_hintjens
|
does 0MQ allow an empty message as in zero frames?
|
[10:38] pieter_hintjens
|
I don't think it does, does it
|
[10:38] sustrik
|
yes
|
[10:38] pieter_hintjens
|
that's why ZMTP says *more-frame final-frame
|
[10:39] pieter_hintjens
|
so it'd be *label-frame *more-frame final-frame
|
[10:39] sustrik
|
yes
|
[10:39] pieter_hintjens
|
ok
|
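The grammar agreed on above, `*label-frame *more-frame final-frame`, can be checked with a few lines of Python. This is only a sketch: frames are reduced to the hypothetical kind strings `"label"`, `"more"` and `"final"`, and `is_valid_message` is an illustrative name, not a 0MQ function.

```python
def is_valid_message(kinds):
    """Check frame kinds against: *label-frame *more-frame final-frame."""
    i = 0
    # Zero or more label frames come first.
    while i < len(kinds) and kinds[i] == "label":
        i += 1
    # Then zero or more data frames that have MORE set.
    while i < len(kinds) and kinds[i] == "more":
        i += 1
    # Exactly one final frame must remain, and it must be the last one.
    return i == len(kinds) - 1 and kinds[i] == "final"


print(is_valid_message(["label", "label", "more", "final"]))
print(is_valid_message(["label"]))
```

Under this grammar a message with zero frames, or one made only of label frames, is rejected, which matches the answers given in the exchange.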
[10:39] pieter_hintjens
|
if you read the next section of that page, it proposes a mechanism for version detection
|
[10:41] sustrik
|
it's not compatible with 2.x protocol
|
[10:42] pieter_hintjens
|
it's not compatible with 2.1 code
|
[10:42] pieter_hintjens
|
which would break
|
[10:42] sustrik
|
yes
|
[10:42] pieter_hintjens
|
but that can be patched trivially, IMO
|
[10:42] pieter_hintjens
|
i.e. reject such connections as invalid
|
[10:42] sustrik
|
the problem is that you are back to the original problem
|
[10:42] sustrik
|
you have to deploy 2.2 in one go
|
[10:42] sustrik
|
no gradual upgrading
|
[10:42] pieter_hintjens
|
so, my proposal is that 2.2 actually do version detection
|
[10:43] pieter_hintjens
|
i'm willing to make this, as far as I can
|
[10:43] pieter_hintjens
|
anyhow, design first... 2.2 should be able to speak to 2.1 apps
|
[10:43] sustrik
|
yeah
|
[10:43] pieter_hintjens
|
and 2.2 can also speak to 2.2 apps, using zmtp/1.1
|
[10:44] pieter_hintjens
|
and 2.2 can then speak to 3.0, using zmtp/1.1
|
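The upgrade matrix pieter_hintjens outlines can be sketched as a lookup plus an intersection. The `SUPPORTED` table is an assumption read off this conversation (2.1 speaks zmtp/1.0 only, 2.2 speaks both, 3.0 speaks zmtp/1.1 only), and `pick_protocol` is a hypothetical helper, not something that exists in any release.

```python
# Assumed mapping from 0MQ release to the ZMTP versions it speaks.
SUPPORTED = {
    "2.1": {(1, 0)},
    "2.2": {(1, 0), (1, 1)},
    "3.0": {(1, 1)},
}


def pick_protocol(local_release, peer_release):
    """Return the highest ZMTP version both releases speak, or None."""
    common = SUPPORTED[local_release] & SUPPORTED[peer_release]
    return max(common) if common else None


for pair in (("2.2", "2.1"), ("2.2", "2.2"), ("2.2", "3.0"), ("2.1", "3.0")):
    print(pair, pick_protocol(*pair))
```

The last pair has no common version, which is the 2.1-to-3.0 gap that a bridging 2.2 release is meant to cover.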
[10:44] sustrik
|
1.2?
|
[10:44] pieter_hintjens
|
no need, 1.1 will have the layering for xpub/xsub, etc. on top
|
[10:45] sustrik
|
ok
|
[10:45] pieter_hintjens
|
I assume we can either add this explicitly with socket pattern validation, or stick with "if you mix the wrong stuff it breaks, too bad" approach
|
[10:45] pieter_hintjens
|
all that really matters is you can upgrade parts of your network piece by piece
|
[10:46] pieter_hintjens
|
this means the upgrade path is (a) move to 2.2, which is relatively safe because it's the same stable codebase for 95%
|
[10:46] mikko
|
pieter_hintjens: i think ideally from user's perspective they shouldn't know about versions
|
[10:46] pieter_hintjens
|
and (b) then start moving to 3.0 piece by piece
|
[10:46] pieterh
|
mikko: the technical user is going to be very aware of versions because they bring specific functionality
|
[10:46] pieterh
|
and risks
|
[10:47] pieterh
|
no avoiding that - new code is always riskier than old code
|
[10:47] pieterh
|
mikko: I made a lovely hard hack yesterday
|
[10:47] pieterh
|
nothing to do with 0MQ...
|
[10:48] pieterh
|
while true; do if test -d /media/*/Android; then banshee --play; else banshee --pause; fi; sleep 1; done
|
[10:48] pieterh
|
dock my mobile phone, music starts playing... undock it, music stops immediately
|
[10:48] jsimmons
|
lol
|
[11:01] mikko
|
pieterh: heheh
|
[11:02] pieterh
|
mikko: I first thought, "buy a Kinect and use it to see if I'm at my desk or not"
|
[11:02] mikko
|
pieterh: it would be fairly simple to do with openni
|
[11:02] mikko
|
gesture based music control
|
[11:02] pieterh
|
then, "hook into screensaver and use that to stop/start music"
|
[11:03] pieterh
|
nice thing is when someone calls me, I just pick up the phone and answer, and the music magically pauses...
|
[11:30] pieterh
|
sustrik: wrt label frames
|
[11:30] pieterh
|
can these be used for all socket patterns?
|
[11:35] sustrik
|
pieterh: yes
|
[11:36] sustrik
|
it's a generic mechanism for tagging messages
|
[11:36] pieterh
|
ok
|
[11:36] sustrik
|
i borrowed the concept from MPLS
|
[11:37] pieterh
|
so version numbering...
|
[11:37] pieterh
|
if we want to do socket pattern validation we also need to send some kind of greeting message
|
[11:38] pieterh
|
hmm, we could already disable identities in this wlp
|
[11:39] sustrik
|
you'll break 2.x compatibility that way
|
[11:39] sustrik
|
which would destroy the whole point
|
[11:39] pieterh
|
well... :-)
|
[11:39] pieterh
|
not really
|
[11:39] pieterh
|
imagine a 2.2 broker talking to 2.1 and 3.0 clients
|
[11:40] pieterh
|
the 2.1 clients can still use explicit identities, 2.2 will support that
|
[11:40] pieterh
|
2.2 to 2.2 won't, and 2.2 to 3.0 won't either
|
[11:40] sustrik
|
3.0 *does* support explicit identities
|
[11:40] pieterh
|
ok
|
[11:41] pieterh
|
so then we have to send a more complex greeting
|
[11:41] pieterh
|
i'll propose something, we can beat that into shape
|
[11:41] sustrik
|
ok
|
[11:59] pieterh
|
sustrik: is there any value in adding a type octet to label frames?
|
[11:59] pieterh
|
right now the syntax of these is undefined
|
[12:04] sustrik
|
pieterh: i would say the actual semantics of the labels are up to the messaging pattern
|
[12:04] pieterh
|
ok... makes sense
|
[12:04] pieterh
|
I've pushed a draft then
|
[12:04] pieterh
|
http://rfc.zeromq.org/spec:15
|
[12:04] sustrik
|
where?
|
[12:04] sustrik
|
let me have a look
|
[12:04] pieterh
|
it's simpler than spec:13
|
[12:06] sustrik
|
hm, it's not compatible with 2.1
|
[12:06] pieterh
|
:-) of course not ...
|
[12:06] sustrik
|
so how would the upgrade process work?
|
[12:07] pieterh
|
ah, ok, so this spec lets us make a 2.2 that speaks both old and new protocols
|
[12:07] pieterh
|
since the semantics are the same, just wire format changes a little
|
[12:07] pieterh
|
I'll need to check that 2.1 correctly kills peers that send invalid identities
|
[12:08] pieterh
|
it requires that the 2.1 peer talks to the 2.2 peer first, not the other way around
|
[12:08] sustrik
|
what's an invalid identity?
|
[12:08] sustrik
|
multi-part one?
|
[12:08] pieterh
|
one starting with a null byte, for zmtp/1.0
|
[12:08] pieterh
|
or, yes, a multipart one
|
[12:09] pieterh
|
both are invalid according to zmtp/1.0
|
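A sketch of the "invalid identity" rule pieterh states here: under zmtp/1.0 an identity frame is invalid if it starts with a null byte or if it is multipart. `classify_identity` and its return strings are hypothetical. How a real 2.1 or 2.2 stack reacts to each case is not settled in this conversation.

```python
def classify_identity(body, more):
    """Classify the first frame a peer sends, following the rule above."""
    if more:
        return "invalid for zmtp/1.0 (multipart)"
    if body[:1] == b"\x00":
        return "invalid for zmtp/1.0 (leading null byte)"
    return "zmtp/1.0 identity"


print(classify_identity(b"worker-7", more=False))
print(classify_identity(b"\x00\x01", more=False))
```

The idea being explored is that a newer peer could use one of the invalid forms as a version marker, on the assumption that an old peer drops the connection instead of misreading it.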
[12:09] sustrik
|
ok, i see
|
[12:12] pieterh
|
in 2.1, a subscriber sends a greeting right away, doesn't it
|
[12:12] sustrik
|
it sends the identity right away
|
[12:13] pieterh
|
ok, so I also added a sender socket type in the greeting
|
[12:14] sustrik
|
ok
|
[12:15] pieterh
|
this'll do for now, I should be able to make the changes to 2.2 if no-one else feels like it
|
[12:15] pieterh
|
there'll be some cases where mixing versions does not work
|
[12:50] guido_g
|
pieterh: off-by-one error in spec detected! "Bits 1-6: Reserved. Bits 1-7 are reserved for future use and MUST be zero."
|
[12:50] guido_g
|
hi btw
|
[12:51] guido_g
|
what happens when length is 0?
|
[12:54] sustrik
|
guido_g: invalid message
|
[12:55] guido_g
|
ok
|
[13:49] pieterh
|
sustrik: it's one thing I'd like to change, make frame size exclude any frame meta data (i.e. flags)
|
[13:49] pieterh
|
so that frame size 0 is valid
|
[13:49] sustrik
|
why so?
|
[13:49] pieterh
|
the current design is consistently confusing
|
[13:50] sustrik
|
there are two layers in it actually
|
[13:50] pieterh
|
it confused me, and I've seen several people hit it exactly the same way
|
[13:50] sustrik
|
there's message delimitation protocol (the size + data)
|
[13:50] sustrik
|
and labeling protocol (flags+data)
|
[13:50] sustrik
|
the delimitation protocol doesn't know about labeling protocol running on top of it
|
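To make the disagreement concrete, here is a sketch of both conventions. It assumes the spec:13 layout of a length field (one octet, or 0xFF followed by eight octets in network byte order), then a flags octet, then the body. `encode_frame` and its `length_includes_flags` switch are hypothetical names used only for this illustration.

```python
import struct


def encode_frame(flags, body, length_includes_flags=True):
    """Encode one frame as length + flags + body."""
    length = len(body) + (1 if length_includes_flags else 0)
    if length < 255:
        header = bytes([length])
    else:
        # Long form: 0xFF escape followed by an 8-octet big-endian length.
        header = b"\xff" + struct.pack(">Q", length)
    return header + bytes([flags]) + body


# Empty body under the current convention: the length octet is 1, never 0.
print(encode_frame(0x00, b"").hex())
# Empty body under the proposed convention: the length octet is 0.
print(encode_frame(0x00, b"", length_includes_flags=False).hex())
```

With the current convention an empty message part carries length 1, so a length of 0 never describes a legal message part. With the proposed one, length 0 simply means an empty body.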
[13:50] pieterh
|
well, that is an explanation but it's not helpful
|
[13:51] pieterh
|
consistently, people ask "why is a size zero not valid?"
|
[13:51] sustrik
|
it is valid frame
|
[13:51] sustrik
|
not a valid message part
|
[13:51] pieterh
|
we don't actually have two protocols
|
[13:51] sustrik
|
you can split the doc if you want
|
[13:51] pieterh
|
and no other protocols do this
|
[13:51] pieterh
|
that would be insane
|
[13:52] pieterh
|
take web sockets as an example
|
[13:52] pieterh
|
frame header + frame data
|
[13:52] pieterh
|
header includes data length + meta data
|
[13:52] pieterh
|
that is what people expect, it's sane
|
[13:52] pieterh
|
to define two protocols just to define a frame is insane
|
[13:52] pieterh
|
and so, no-one expects it, so it's consistently confusing
|
[13:52] pieterh
|
that's my experience
|
[13:52] sustrik
|
easy to process in hardware
|
[13:53] pieterh
|
i understand that is the underlying rationale but...
|
[13:53] pieterh
|
adding 1 or 2 doesn't change anything IMO
|
[13:53] sustrik
|
it binds the framing layer tightly with the labeling layer
|
[13:53] pieterh
|
and the problem with that is?
|
[13:54] pieterh
|
do you have any use case for reusing a 1-line framing layer?
|
[13:54] pieterh
|
for various reasons but mainly future IETF compatibility I'd really try to make it look more like the websockets header
|
[13:54] sustrik
|
low level hardware
|
[13:54] pieterh
|
unproven
|
[13:55] sustrik
|
say a simple forwarder
|
[13:55] sustrik
|
no need to know the labeling protocol
|
[13:55] pieterh
|
philosophical arguments of "efficiency" vs. practical experience of "this design is confusing"
|
[13:55] sustrik
|
come on, it's just a number
|
[13:55] pieterh
|
and this is not SP, it's just a WLP for 0MQ
|
[13:55] pieterh
|
it's a consistent cause of confusion
|
[13:55] sustrik
|
feel free to change it in 2.2 then
|
[13:56] pieterh
|
making a specific protocol for 2.2 would be kind of stupid
|
[13:56] sustrik
|
you can change it in 3.0
|
[13:56] sustrik
|
you are the maintainer
|
[13:56] pieterh
|
sigh
|
[13:56] sustrik
|
it's your decision
|
[13:56] pieterh
|
sustrik: please don't waste my time
|
[13:57] pieterh
|
really, if I'm explaining why the size field is confusing, accept my explanations and then think about it
|
[13:57] pieterh
|
do *not* give me pointless arguments ending in "well, you maintain it, do what you like"
|
[13:57] sustrik
|
i prefer to design based on technical arguments
|
[13:57] pieterh
|
if the practical experience is outweighed by the philosophy, fine
|
[13:58] pieterh
|
but do not just ignore feedback from your users
|
[13:58] sustrik
|
in this case it's technical argument vs. make it look like websockets
|
[13:58] pieterh
|
and do not, fgs, just tell them to patch your software
|
[13:58] pieterh
|
I am not going to make changes to 3.0, you should understand that
|
[13:58] sustrik
|
what do you want me to do then?
|
[13:59] sustrik
|
imo the "make it look like websockets" argument doesn't make much sense
|
[13:59] pieterh
|
I don't want you to do anything at all except listen to your users and use our accumulated experience to make 0MQ a better product
|
[13:59] pieterh
|
that was not my main argument, it is incidental
|
[14:00] pieterh
|
the websockets framing solves very much the same problem and comes from 1 year of argument on the HyBi list
|
[14:00] sustrik
|
ok, so what's the technical argument?
|
[14:00] pieterh
|
the current design is consistently confusing
|
[14:00] pieterh
|
sigh
|
[14:00] pieterh
|
confusing designs are by definition bad
|
[14:01] pieterh
|
layering the framing into two protocols is significant over-engineering that has a negative cost-benefit outcome
|
[14:01] sustrik
|
say IP uses the same design
|
[14:01] pieterh
|
not relevant
|
[14:01] sustrik
|
IP is more common than websockets
|
[14:01] pieterh
|
not relevant, no-one is making IP stacks today
|
[14:02] pieterh
|
look, just tell me this is to allow super fast hardware parsing of frames
|
[14:02] pieterh
|
discussion over
|
[14:04] pieterh
|
if you turn protocol discussions into "well, you can change it in version X of the software, not my problem"
|
[14:04] pieterh
|
you are effectively saying, "I'm not interested in interoperability"
|
[14:05] pieterh
|
if you find yourself doing this to a significant community of users, your credibility as protocol designer is pretty dead
|
[14:06] reuben
|
what's the downside of doing what pieterh is suggesting?
|
[14:06] reuben
|
I mean, even if there -
|
[14:06] reuben
|
>_>
|
[14:16] sustrik
|
re
|
[14:16] sustrik
|
sorry, got disconnected
|
[14:17] sustrik
|
<sustrik> the problem behind the design is framing is transport-specific while labeling is not
|
[14:17] sustrik
|
<sustrik> ie. sctp has framing of its own
|
[14:17] sustrik
|
<sustrik> so it only needs the labeling part
|
[14:18] sustrik
|
btw, there's one more thing i would like to do
|
[14:19] sustrik
|
it has to do with building new patterns on top of 0mq
|
[14:19] sustrik
|
currently, ROUTER can be used for that
|
[14:19] sustrik
|
but it is basically hijacked XREP
|
[14:20] sustrik
|
so i thought of creating a socket type that would fit the role better
|
[14:20] pieterh
|
for ROUTER?
|
[14:20] sustrik
|
like, for example, it could send connection/disconnection messages etc.
|
[14:20] pieterh
|
sure, there are a lot of potentially useful improvements to ROUTER
|
[14:21] sustrik
|
once it is not bound to req/rep
|
[14:21] sustrik
|
we are free to change it
|
[14:21] sustrik
|
so that it suits the purpose better
|
[14:21] pieterh
|
note that for ROUTER to make any sense it has to talk to other socket types
|
[14:21] sustrik
|
my problem with that is that it breaks the semantics of existing patterns
|
[14:22] sustrik
|
however, if the goal is to create new patterns
|
[14:22] pieterh
|
well, what's the alternative to that?
|
[14:22] sustrik
|
the problem disappears
|
[14:22] sustrik
|
the semantics are up to the user space implementation of the pattern
|
[14:22] pieterh
|
you need some other socket type (non-ROUTER) in 95% of patterns
|
[14:22] pieterh
|
DEALER is most common
|
[14:22] pieterh
|
so you could create SERVER and CLIENT
|
[14:23] sustrik
|
you can create dealer-like semantics using generalised router socket
|
[14:23] sustrik
|
no?
|
[14:23] pieterh
|
where SERVER is ROUTER with extensions and CLIENT is like PAIR
|
[14:23] pieterh
|
no
|
[14:23] pieterh
|
you cannot use ROUTER as DEALER
|
[14:23] sustrik
|
why so?
|
[14:23] pieterh
|
srsly?
|
[14:23] sustrik
|
yes
|
[14:23] sustrik
|
no idea
|
[14:23] pieterh
|
didn't you read the guide yet?
|
[14:24] reuben
|
haha
|
[14:24] sustrik
|
if you know about all the active peers
|
[14:24] pieterh
|
no, srsly
|
[14:24] sustrik
|
you can do load-balancing in the user space
|
[14:24] pieterh
|
did you read the guide, where we actually build patterns in userspace, using router?
|
[14:24] sustrik
|
some of it
|
[14:24] pieterh
|
there are 10-20 clear use cases there
|
[14:24] pieterh
|
including *exhaustive* explanation of why router-to-router doesn't make sense
|
[14:24] pieterh
|
except in one single usecase
|
[14:25] pieterh
|
we've discussed this multiple times wrt identities
|
[14:25] sustrik
|
can you give me a pointer?
|
[14:25] pieterh
|
ok,
|
[14:25] pieterh
|
a router cannot route until it gets an incoming message
|
[14:25] pieterh
|
it is a server semantic
|
[14:25] pieterh
|
it needs a client to talk to it first and say, "I'm here"
|
[14:25] sustrik
|
that's why i am saying the connection/disconnection notification would have to be added
|
[14:25] pieterh
|
no difference
|
[14:26] pieterh
|
a router doesn't know addresses of inexistent clients up front
|
[14:26] pieterh
|
the only way
|
[14:26] pieterh
|
is if you use explicit identities
|
[14:26] sustrik
|
that can be added to the generalised router socket
|
[14:26] pieterh
|
and the only sane way I found was to make them ipaddress:port
|
[14:26] sustrik
|
connection happens => user gets notification message containing connection ID
|
[14:26] pieterh
|
so you want to mix router/dealer semantics together
|
[14:26] pieterh
|
into a kind of "whatever" socket
|
[14:26] sustrik
|
yes
|
[14:27] pieterh
|
sure
|
[14:27] pieterh
|
if you're going to do that
|
[14:27] sustrik
|
i can
|
[14:27] sustrik
|
if you find it useful
|
[14:27] pieterh
|
then you can make fair queuing, etc. configurable
|
[14:27] sustrik
|
whatever you want
|
[14:27] sustrik
|
basically, it would allow 0mq to be used as a dumb networking framework
|
[14:27] sustrik
|
such as ACE
|
[14:27] pieterh
|
well, to be honest, the problem isn't lack of power
|
[14:27] pieterh
|
the problem is complexity
|
[14:28] pieterh
|
it is not worth just adding functionality if the semantics become too complex to use
|
[14:28] sustrik
|
that's the question
|
[14:28] pieterh
|
so, what I'd suggest is a SERVER / CLIENT semantic
|
[14:28] pieterh
|
very explicit
|
[14:28] pieterh
|
client can connect to exactly one server
|
[14:29] pieterh
|
server is ROUTER plus connect/disconnect notifications
|
[14:29] sustrik
|
right client/server pattern
|
[14:29] pieterh
|
they do heartbeating and possibly some other stuff like credit-based flow control instead of HWMs
|
[14:29] pieterh
|
I have a design for CBFC if you want it, but it's what TCP does, not very complex
|
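Editor's sketch: pieterh's actual CBFC design is not shown in this log, so the following is only a generic illustration of credit-based flow control. The receiver grants credit, and the sender transmits only while it holds credit. The class name `CreditSender`, its methods, and the `wire` list standing in for the network are all hypothetical.

```python
from collections import deque


class CreditSender:
    """Toy sender that may only transmit while it holds credit."""

    def __init__(self):
        self.credit = 0          # messages we are currently allowed to send
        self.pending = deque()   # messages waiting for credit
        self.wire = []           # stands in for the network (assumption)

    def grant(self, n):
        """Called when the receiver announces room for n more messages."""
        self.credit += n
        self._flush()

    def send(self, msg):
        """Queue a message; it leaves only when credit is available."""
        self.pending.append(msg)
        self._flush()

    def _flush(self):
        while self.credit > 0 and self.pending:
            self.wire.append(self.pending.popleft())
            self.credit -= 1


sender = CreditSender()
for i in range(5):
    sender.send(f"msg-{i}")
sent_without_credit = len(sender.wire)   # nothing leaves before a grant

sender.grant(3)
after_first_grant = list(sender.wire)    # the three oldest messages

sender.grant(10)                         # remaining two go out, credit is left over
```

Unlike a high-water mark, the limit here is driven by the receiver: the sender cannot overrun a peer that has stopped granting credit.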
[14:30] sustrik
|
the problem with that is that that way i would have to implement every new pattern in 0mq
|
[14:30] sustrik
|
no new patterns in user space
|
[14:30] pieterh
|
what do you mean?
|
[14:30] pieterh
|
all the high level patterns could be built on that
|
[14:30] sustrik
|
if someone wanted some other pattern, it would still have to be implemented inside 0mq
|
[14:30] pieterh
|
well, I don't see that
|
[14:31] pieterh
|
you could implement all the existing patterns using server/client
|
[14:31] sustrik
|
the one-connection-per-client disables a whole class of patterns
|
[14:31] sustrik
|
otherwise it's the same as i am proposing afaics
|
[14:32] pieterh
|
the one connection per client could be configurable
|
[14:32] pieterh
|
you need this in most simple cases to avoid utter confusion
|
[14:32] pieterh
|
but in fact the dealer semantics are most accurate afaics
|
[14:33] sustrik
|
ok, let me explain how i imagine a generic router socket can work
|
[14:33] pieterh
|
go for it
|
[14:33] sustrik
|
you can do any number of binds or connects
|
[14:34] pieterh
|
sure
|
[14:34] sustrik
|
when a connection is established, you get a notification with the connection identity
|
[14:34] pieterh
|
how do you get notifications? as messages?
|
[14:34] sustrik
|
yes
|
[14:34] sustrik
|
when a disconnection happens, you get a notification
|
[14:34] sustrik
|
when you send a message, you prepend it with identity
|
[14:34] pieterh
|
sure
|
[14:35] sustrik
|
when you receive a message it's prepended by identity
|
[14:35] sustrik
|
that's it
|
[14:35] pieterh
|
so just like router except with notifications
|
[14:35] sustrik
|
completely generic
|
[14:35] sustrik
|
yes, something like that
|
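Editor's sketch: the behaviour sustrik lists above (notifications on connect and disconnect, identity prepended on send and receive) can be modelled in a few lines. This is not libzmq code. `GenericSocketModel`, its method names, and the tuple encoding of events are assumptions made for illustration, and raising on an unknown identity is also a choice of this sketch.

```python
from collections import deque


class GenericSocketModel:
    """In-memory model of the proposed socket; no networking involved."""

    def __init__(self):
        self.peers = {}        # identity -> queue of outgoing bodies
        self.inbox = deque()   # what the application would recv()

    # --- events that the transport layer would raise -------------------
    def on_connect(self, identity):
        self.peers[identity] = deque()
        self.inbox.append(("connected", identity))

    def on_disconnect(self, identity):
        del self.peers[identity]
        self.inbox.append(("disconnected", identity))

    def on_wire_data(self, identity, body):
        # Incoming messages arrive prefixed with the sender's identity.
        self.inbox.append(("message", identity, body))

    # --- application-facing calls ------------------------------------
    def send(self, identity, body):
        # Outgoing messages are addressed by identity.
        if identity not in self.peers:
            raise LookupError(f"no connection with identity {identity!r}")
        self.peers[identity].append(body)

    def recv(self):
        return self.inbox.popleft()


sock = GenericSocketModel()
sock.on_connect(b"peer-1")
sock.on_wire_data(b"peer-1", b"hello")
sock.send(b"peer-1", b"world")
sock.on_disconnect(b"peer-1")
events = [sock.recv() for _ in range(3)]
```

The application sees connection changes in the same stream as data, which is what makes it possible to build patterns on top in user space.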
[14:35] pieterh
|
now you still need a peer to talk to
|
[14:35] sustrik
|
same socket type
|
[14:35] pieterh
|
clearly you cannot connect two of these sockets together
|
[14:35] sustrik
|
why not?
|
[14:35] sustrik
|
it's just a glorified TCP
|
[14:36] pieterh
|
could you make the identity into a schemed string?
|
[14:36] sustrik
|
whatever
|
[14:37] pieterh
|
it's a neat design, I like it
|
[14:37] sustrik
|
ok, i can do that
|
[14:37] sustrik
|
that way the people with non-standard use cases still have an option
|
[14:38] pieterh
|
well, apart from pub-sub, about 90% of use cases are non-standard afaics
|
[14:38] sustrik
|
possibly
|
[14:38] pieterh
|
there is pub-sub, pair for inproc pipes, and then req-rep for naive apps, and router-based patterns for everything else
|
[14:39] pieterh
|
which works fine
|
[14:39] sustrik
|
it would allow people to build alternative patterns on top of 0mq
|
[14:39] sustrik
|
if some of them prove useful
|
[14:39] sustrik
|
more effort can be spent in incorporating them directly into 0mq
|
[14:39] pieterh
|
:-) we've been doing this (allowed or not) for about 12 months...
|
[14:39] pieterh
|
since router got documented
|
[14:40] sustrik
|
the problem at the time was that it was actually part of req/rep
|
[14:40] pieterh
|
not really a problem, it worked fine
|
[14:40] sustrik
|
it allowed applications to break req/rep semantics
|
[14:40] pieterh
|
i didn't see that bug report
|
[14:40] pieterh
|
is there a test case?
|
[14:42] sustrik
|
easy, i can create some
|
[14:42] sustrik
|
anyway, the problem is breaking the encapsulation
|
[14:43] sustrik
|
back to generic routing: i can write the code
|
[14:43] sustrik
|
the schemed identities are a bit of a problem (testing on different platforms etc.)
|
[14:44] sustrik
|
however, i can provide an extension point for transports
|
[14:44] sustrik
|
to report the schemed identity
|
[14:44] sustrik
|
so that it can be added easily
|
[14:45] pieterh
|
what do you consider as a 'disconnection'?
|
[14:46] pieterh
|
there is TCP disconnect, bound peer treating other as 'gone away', connected peer treating other as 'gone away'
|
[14:46] sustrik
|
true
|
[14:47] pieterh
|
the actual requirements are:
|
[14:47] sustrik
|
explicit identities and thus the notion of session surviving disconnection would have to be removed before creating the new socket type
|
[14:47] pieterh
|
- client mostly never wants a disconnect, but in some cases may want, via heartbeating
|
[14:47] pieterh
|
- server wants to define a configurable timeout after which it can kick dead clients
|
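Editor's sketch: the second requirement, a configurable timeout after which the server kicks dead clients, amounts to a last-seen table. `PeerTable` and its methods are hypothetical names, and timestamps are passed in explicitly so the example stays deterministic.

```python
class PeerTable:
    """Tracks when each client was last heard from."""

    def __init__(self, timeout):
        self.timeout = timeout   # seconds of silence tolerated
        self.last_seen = {}      # identity -> timestamp of last activity

    def touch(self, identity, now):
        """Record activity (any message or heartbeat) from a client."""
        self.last_seen[identity] = now

    def expire(self, now):
        """Remove and return clients silent for longer than the timeout."""
        dead = [peer for peer, seen in self.last_seen.items()
                if now - seen > self.timeout]
        for peer in dead:
            del self.last_seen[peer]
        return dead


table = PeerTable(timeout=5.0)
table.touch("client-a", now=0.0)
table.touch("client-b", now=4.0)
kicked = table.expire(now=6.0)   # client-a has been silent for 6 seconds
```

In a real server `now` would come from a monotonic clock, and `expire` would run on each pass of the event loop.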
[14:48] pieterh
|
ok, so it's TCP connection (or whatever the transport layer is)
|
[14:48] sustrik
|
ack, a way for explicit disconnection would have to be added
|
[14:48] pieterh
|
indeed
|
[14:48] pieterh
|
forget heartbeats, that can be added on top
|
[14:49] sustrik
|
yes
|
[14:49] pieterh
|
so I assume when doing an outbound connect, client would get a 'connected' message when that succeeded?
|
[14:49] sustrik
|
yes
|
[14:50] sustrik
|
one more question: what should be the behaviour on HWM?
|
[14:50] sustrik
|
block, i presume
|
[14:50] pieterh
|
hang on, one other thought
|
[14:50] pieterh
|
for notifications and commands
|
[14:51] pieterh
|
we probably need a reserved identity that is actually the socket itself
|
[14:51] pieterh
|
sys://socket or whatever
|
[14:51] sustrik
|
good question
|
[14:51] pieterh
|
working out-of-band would also be doable
|
[14:51] sustrik
|
i would rather pass them directly via the socket
|
[14:52] sustrik
|
but that requires some thinking about how to distinguish commands from data
|
[14:52] pieterh
|
I mean, frame 0 = "sys://socket", frame 1 = "CLOSE", frame 2 = "identity of peer"
|
[14:52] pieterh
|
that could go in both directions
|
[14:52] pieterh
|
close command same as close notification
|
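Editor's sketch: pieterh's three-frame layout (frame 0 = "sys://socket", frame 1 = "CLOSE", frame 2 = identity of peer) can be built and recognised as below. Only those three frame values come from the discussion. The helper names `make_close` and `classify`, and the example identity, are hypothetical.

```python
SOCKET_ADDRESS = b"sys://socket"   # reserved identity meaning "the socket itself"


def make_close(peer_identity):
    """Build the multipart CLOSE message for a given peer."""
    return [SOCKET_ADDRESS, b"CLOSE", peer_identity]


def classify(frames):
    """Split a multipart message into a command or ordinary data."""
    if not frames:
        raise ValueError("empty message")
    if frames[0] == SOCKET_ADDRESS:
        if len(frames) != 3:
            raise ValueError("command needs exactly three frames")
        return ("command", frames[1], frames[2])
    # Anything else is data: first frame is the peer, the rest is the body.
    return ("data", frames[0], frames[1:])


close_msg = make_close(b"tcp://192.0.2.10:5555")
as_command = classify(close_msg)
as_data = classify([b"tcp://192.0.2.10:5555", b"hello"])
```

Because the same frames serve as both command and notification, an application can send `close_msg` to drop a peer and can expect the same shape back when a peer goes away.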
[14:52] pieterh
|
open command? for consistency? bleh...
|
[14:53] sustrik
|
it'll operate on hop-by-hop level anyway
|
[14:53] sustrik
|
labels/commands are visible at that layer anyway
|
[14:54] sustrik
|
e.g. "subscribe" command
|
[14:54] sustrik
|
you can recv() it from XPUB socket
|
[14:54] sustrik
|
and forward it to XSUB socket
|
[14:54] pieterh
|
is this documented somewhere?
|
[14:55] sustrik
|
zmq_socket(3)
|
[14:55] sustrik
|
also in form of whitepaper: 250bpm.com/pubsub
|
[14:55] sustrik
|
it should probably get into the guide later on
|
[14:56] pieterh
|
the guide will always cover the stable version... so eventually it'll get there
|
[14:57] sustrik
|
ack
|
[14:58] pieterh
|
ok, so you use a simple binary framing for commands...
|
[14:58] sustrik
|
possibly
|
[14:58] pieterh
|
if router socket always sends/recvs address+message
|
[14:58] pieterh
|
then you need an address that means "the socket itself not some other peer"
|
[14:58] sustrik
|
or labal with no data = command ?
|
[14:58] sustrik
|
label
|
[14:59] pieterh
|
then you can use the same approach for commands in the message, e.g. a byte 1 for connect, byte 2 for disconnect
|
[14:59] pieterh
|
labels aren't visible to the application, are they?
|
[14:59] sustrik
|
for example
|
[14:59] sustrik
|
they are visible on hop-by-hop layer
|
[14:59] sustrik
|
invisible on end-to-end layer
|
[14:59] pieterh
|
you said we'd use this socket at end-to-end as well as hop-by-hop
|
[15:00] pieterh
|
you are mixing a lot of concepts here, it feels like mayonnaise
|
[15:00] pieterh
|
with ketchup
|
[15:00] sustrik
|
no, what i meant was that the "generic" socket can be a hop-by-hop concept
|
[15:00] pieterh
|
that has no meaning to me
|
[15:00] sustrik
|
you can build whatever end-to-end layer on top
|
[15:00] sustrik
|
the goal is to allow custom patterns after all
|
[15:01] sustrik
|
on top = in user space
|
[15:01] pieterh
|
there is a scaling disconnect here
|
[15:01] sustrik
|
sorry?
|
[15:01] pieterh
|
the only problems we're solving with router are where hop-by-hop == end-to-end
|
[15:02] sustrik
|
ok
|
[15:02] pieterh
|
so you are making a distinction that makes no sense to the problem
|
[15:02] pieterh
|
I'd say we're several years away from understanding how e.g. to do multiple hops for the more complex patterns
|
[15:02] pieterh
|
analogous to doing federation in AMQP, that took years to figure out
|
[15:03] sustrik
|
at least the design is future proof, you can use generic socket to create even complex scalable patterns on top of 0mq
|
[15:03] sustrik
|
but you don't have to
|
[15:03] pieterh
|
power isn't profitable
|
[15:03] pieterh
|
simplicity is
|
[15:03] pieterh
|
simplicity at scale is magic
|
[15:04] pieterh
|
power at scale is just a mess
|
[15:04] sustrik
|
if you have any idea what the end-to-end layer on top of generic socket should look like
|
[15:04] sustrik
|
propose it
|
[15:04] pieterh
|
I'd say you construct that on top
|
[15:04] pieterh
|
out of multiple hops
|
[15:05] pieterh
|
we do that already, e.g. where a client is actually a server for a set of local workers
|
[15:05] pieterh
|
it's just not one pattern, it's different patterns, interconnected
|
[15:06] pieterh
|
anyhow, this is mostly philosophy
|
[15:06] sustrik
|
anyway, i'll implement the hop-by-hop layer first
|
[15:06] sustrik
|
then we can discuss what should be on top of iot
|
[15:06] pieterh
|
"label with no data part means a command" is just confusing several things IMO
|
[15:06] sustrik
|
it
|
[15:06] sustrik
|
possibly
|
[15:06] pieterh
|
command = simplest possible way for application to pass connection meta data to/from socket
|
[15:07] sustrik
|
i have no opinion on framing commands atm
|
[15:07] sustrik
|
does "subscribe" command fit the definition?
|
[15:07] pieterh
|
so for HWM
|
[15:07] pieterh
|
it should block, yes
|
[15:07] sustrik
|
ack
|
[15:08] pieterh
|
i think the rest of ROUTER semantics are accurate, e.g. sending to non-connection means 'discard'
|
[15:08] sustrik
|
as you wish
|
[15:08] sustrik
|
it can report an error though
|
[15:08] sustrik
|
no idea what's better
|
[15:08] pieterh
|
that would be nicer, yes
|
[15:09] pieterh
|
sending to a non-existent peer is bad enough to possibly warrant an assertion even
|
[15:09] sustrik
|
ok
|
[15:09] pieterh
|
an error would be good
|
[15:09] sustrik
|
actually, error
|
[15:09] pieterh
|
how about naming the socket type?
|
[15:10] sustrik
|
dunno
|
[15:10] sustrik
|
ZMQ_GENERIC
|
[15:10] sustrik
|
whatever
|
[15:10] pieterh
|
well, it could be an evolution of ROUTER
|
[15:10] pieterh
|
which replaces ROUTER and DEALER
|
[15:10] pieterh
|
a REALER
|
[15:10] pieterh
|
or DOUTER
|
[15:10] sustrik
|
the problem is backward compatibility
|
[15:11] pieterh
|
zmq_setsockopt
|
[15:11] pieterh
|
but ok
|
[15:11] sustrik
|
ZMQ_RAW
|
[15:11] pieterh
|
well, it's not raw
|
[15:11] pieterh
|
actually, PEER would work here IMO
|
[15:12] sustrik
|
maybe
|
[15:12] pieterh
|
if this works, we can eventually kill router & dealer
|
[15:13] sustrik
|
that would be nice, but would break backward compatibility :|
|
[15:13] sustrik
|
anyway, we can figure that out later on
|
[15:13] pieterh
|
well, "kill" in the sense of sending to a far island and pretending we never heard of them...
|
[15:13] sustrik
|
right
|
[15:14] sustrik
|
last question: is usign generic socket any simpler than using raw TCP?
|
[15:14] sustrik
|
using*
|
[15:14] pieterh
|
so, suggestion... why not make this as an extension of router, enabled by setsockopt
|
[15:14] pieterh
|
oh, far far far simpler
|
[15:14] sustrik
|
hm
|
[15:14] sustrik
|
message delimitation
|
[15:14] pieterh
|
there's about 5,000 lines of code saved
|
[15:14] sustrik
|
what else?
|
[15:15] pieterh
|
you mean, apart from framing, multiparts, async i/o, error handling, portability, epoll/kpoll/select, ?
|
[15:16] sustrik
|
framing, multipart and portability
|
[15:16] pieterh
|
apart from working over multiple transports invisibly
|
[15:16] sustrik
|
bingo
|
[15:16] pieterh
|
performance
|
[15:16] sustrik
|
multiple transports is a killer feature
|
[15:16] sustrik
|
hm
|
[15:16] pieterh
|
yes
|
[15:16] sustrik
|
actually, it won't work over multicast
|
[15:17] pieterh
|
but so it the rest, I know cause I'm reimplementing a lot of it in VTX
|
[15:17] pieterh
|
*is the rest
|
[15:17] sustrik
|
true
|
[15:17] pieterh
|
asynchronous connects
|
[15:18] pieterh
|
reconnects, from the client out to the server
|
[15:18] sustrik
|
yes
|
[15:18] sustrik
|
there's value in it
|
[15:18] sustrik
|
you are right
|
[15:19] pieterh
|
i can actually tell you how many days 0MQ saves in a realistic TCP client-to-server case
|
[15:19] pieterh
|
without TCP, minimal cost to make a scalable client-server design is about 3 months
|
[15:19] pieterh
|
without *0MQ, sorry
|
[15:20] pieterh
|
with 0MQ, about 4 hours
|
[15:20] sustrik
|
in C, right?
|
[15:20] pieterh
|
in C, on one single platform (Linux)
|
[15:20] pieterh
|
if you were copying a lot of code, make that 2-3 weeks
|
[15:20] sustrik
|
ok, makes sense
|
[15:21] pieterh
|
0MQ still has big holes but we assume these can be fixed without hurting the apps
|
[15:21] pieterh
|
e.g. paranoia about ... oh, I have a great idea...
|
[15:21] pieterh
|
we could make just this socket paranoid, to start with
|
[15:21] sustrik
|
in what sense?
|
[15:21] pieterh
|
max clients, max message size, max queue memory used, etc.
|
[15:21] pieterh
|
I'm not sure if that's easier on a per-socket basis or globally
|
[15:22] sustrik
|
globally is better
|
[15:22] pieterh
|
ok, I didn't say anything
|
[15:22] sustrik
|
the infrastructure for that is shared
|
[15:22] pieterh
|
also the real magic with 0MQ sockets over TCP is that you use them both for talking to the outside world
|
[15:22] pieterh
|
and for internal multithreading
|
[15:23] pieterh
|
that's the real killer feature IMO
|
[15:23] sustrik
|
have you seen that use case in the wild?
|
[15:24] pieterh
|
every single advanced use case is multithreaded
|
[15:24] pieterh
|
in the Guide too
|
[15:24] pieterh
|
it's inevitable
|
[15:24] pieterh
|
and it would be impossible if we had one loop for inter-thread events and one for inter-process events
|
[15:24] sustrik
|
ah, that way
|
[15:24] pieterh
|
s/impossible/horribly painful/
|
[15:24] sustrik
|
right
|
[15:25] pieterh
|
coffee, brb
|
[15:37] ssi
|
I have a pretty good internal multithreading application with 0MQ
|
[15:37] ssi
|
it's kind of like a pipeline pattern, but with LRU queue for flow control
|
[15:46] sustrik
|
what i originally imagined was people moving pieces of functionality from inside the process (threads) into separate applications
|
[15:46] sustrik
|
what pieter had in mind was that in-process and remote logic can be monitored in a single event loop
|
[15:47] sustrik
|
the in-process transport itself is definitely useful as such though
|
[16:10] cremes
|
sustrik: i use inproc for inter-thread communications in my programs instead of mutexes
|
[16:10] cremes
|
works great
|
[16:11] cremes
|
and its conceptually much simpler for me to understand
|
[16:11] sustrik
|
cremes: yes, understood
|
[16:18] pieterh
|
sustrik: we have both use cases
|
[16:18] pieterh
|
in CZMQ I call these 'detached' and 'attached' threads
|
[16:19] pieterh
|
detached threads use their own context, and communicate over IPC or TCP or PGM
|
[16:19] pieterh
|
attached threads share the parent context and talk to the parent thread over an inproc 'pipe' (a pair of PAIR/inproc sockets CZMQ creates automatically)
|
[16:19] pieterh
|
this is cleaner than saying things like "create one context per process"
|
[16:34] sustrik
|
pieterh: what we are going to do about the SIGPIPE thing
|
[16:34] sustrik
|
i have nowhere to test it
|
[16:34] pieterh
|
ah, right, got a little distracted
|
[16:34] sustrik
|
do you have a mac?
|
[16:34] pieterh
|
we have a user with a reproducible case
|
[16:35] pieterh
|
Bill Hathaway
|
[16:35] pieterh
|
I'll make a patch, send it to him, ask him to try it
|
[16:35] sustrik
|
ok
|
[16:35] pieterh
|
there's one place we set options on new sockets, right?
|
[16:36] sustrik
|
you mean tcp sockets?
|
[16:36] sustrik
|
yes
|
[16:36] sustrik
|
tcp_socket.cpp
|
[16:36] sustrik
|
:)
|
[16:36] pieterh
|
goodo
|
[16:41] pieterh
|
sustrik: ok, patched, we'll see what Bill says.
|
[18:24] wailupe2k
|
So I have a queue that pulls from a zmq and broadcasts to another, and periodically after a few million events it will display Assertion failed: nbytes == sizeof (command_t) (mailbox.cpp:194), The publisher is perl, and the subscriber is node.js.... Any ideas?
|
[18:36] ssi
|
hrm this is annoying... trying to map my workers by socket identity, and the socket identity is coming back empty
|
[18:36] ssi
|
it's after the socket's been created, connected, and a message sent
|
[18:40] ssi
|
hrm, nm
|
[18:41] guido_g
|
http://api.zeromq.org/2-1:zmq-getsockopt <- see ZMQ_IDENTITY
|
[18:41] ssi
|
it was some kind of goofiness in constructing a string from the byte[] that identity returns
|
[18:41] ssi
|
(jzmq)
|
[18:41] guido_g
|
ahh ok
|
[18:42] guido_g
|
oh twitter says pieterh is happy with amqp/1.0 ]:->
|
[18:43] pieterh
|
guido_g: happy in the sense that watching other peoples' disasters makes us all feel more alive, yes
|
[18:43] ssi
|
hahah
|
[18:43] guido_g
|
13 states
|
[18:44] guido_g
|
sounds like a bad movie :)
|
[18:46] pieterh
|
A bad Czech movie from 1993, "13 States of Connection"
|
[18:46] pieterh
|
Starring Ivan N. Terpryze
|
[18:47] guido_g
|
http://www.imdb.com/title/tt1473773/
|
[18:47] guido_g
|
Drama 2010
|
[18:47] guido_g
|
Not yet released
|
[18:47] guido_g
|
so true :)
|
[18:47] pieterh
|
sweet lord it actually exists?!
|
[18:47] pieterh
|
what are the chances of that?
|
[18:48] guido_g
|
Quotes
|
[18:48] guido_g
|
Matthew O'Connor: We need someone who's professional and reliable - 'cos we're not.
|
[18:48] guido_g
|
omg
|
[19:54] cremes
|
wailupe2k: what version of libzmq and what OS?
|
[19:56] wailupe2k
|
BSD, ZMQ 2.1.7
|
[19:57] sustrik
|
that's the old problem with socketpair buffer limits
|
[19:57] sustrik
|
is solved in 3.0
|
[19:57] wailupe2k
|
hum
|
[19:57] wailupe2k
|
do you know of a BSD port for 3?
|
[19:59] sustrik
|
you can just build it
|
[19:59] sustrik
|
the problem is rather with bindings
|
[19:59] sustrik
|
3.0 was released day before yesterday
|
[19:59] wailupe2k
|
ah,
|
[19:59] sustrik
|
and bindings haven't caught up yet
|
[19:59] reuben
|
3.0 was released?
|
[19:59] wailupe2k
|
this is on a prod sys :(
|
[20:00] sustrik
|
well, i can point you to the patch
|
[20:00] sustrik
|
you can try to backpoty it to 2.1.7
|
[20:00] sustrik
|
backport*
|
[20:00] wailupe2k
|
cool
|
[20:01] sustrik
|
wait a sec
|
[20:02] sustrik
|
this is the patch: https://github.com/zeromq/libzmq/commit/7c0c79812075459765440ca26bad56f4f7ddbe52
|
[20:03] wailupe2k
|
hum,,, lots got cut out there.....
|
[20:03] wailupe2k
|
Thanks though!
|
[20:03] wailupe2k
|
sustrik++
|
[20:11] cremes
|
wailupe2k: look up the faq and tuning guides on zeromq.org
|
[20:11] cremes
|
wailupe2k: and adjust your system socket buffers
|
[20:12] wailupe2k
|
cremes: oh, cool I'll check that out right now
|
[20:12] cremes
|
0mq 2.x uses a socketpair internally for signaling so the small bsd defaults cause problems like what you are seeing
|
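Editor's sketch: the point about socketpair buffers can be observed with the standard library alone. This does not use or model libzmq. It only shows that a socketpair nobody reads from accepts a finite number of fixed-size records before a non-blocking write fails or comes up short. `COMMAND_SIZE` is a made-up record size, not the real `sizeof (command_t)`.

```python
import socket

COMMAND_SIZE = 64  # hypothetical fixed-size command record


def fill_signalling_pipe(command_size=COMMAND_SIZE, max_commands=1_000_000):
    """Write fixed-size records into a socketpair that nobody reads.

    Returns (whole_records_written, short_write_size_or_None).
    """
    writer, reader = socket.socketpair()
    writer.setblocking(False)
    command = b"\x00" * command_size
    whole = 0
    short_write = None
    try:
        while whole < max_commands:
            try:
                sent = writer.send(command)
            except BlockingIOError:
                break                 # kernel buffer is full
            if sent != command_size:
                short_write = sent    # only part of a record was accepted
                break
            whole += 1
    finally:
        writer.close()
        reader.close()
    return whole, short_write


whole, short_write = fill_signalling_pipe()
print(f"{whole} whole records fit; short write: {short_write}")
```

The count depends on the operating system's socket buffer defaults, which is why raising them, as cremes suggests, moves the limit.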
[20:16] pieterh
|
sustrik: do you have a Jira issue for that pipes problem?
|
[20:16] pieterh
|
I'm trying a backport right now
|
[20:16] sustrik
|
hm, let me see
|
[20:22] sustrik
|
probably not
|
[20:22] sustrik
|
i can't find it
|
[20:23] pieterh
|
perhaps https://zeromq.jira.com/browse/LIBZMQ-166?
|
[20:23] mikko
|
php extension should be up to date with 3.0 soonish
|
[20:23] sustrik
|
it's a mix of several issues
|
[20:23] sustrik
|
the third one seems to fit
|
[20:23] sustrik
|
mikko: nice
|
[20:24] pieterh
|
mikko: you mean with 3.0 support?
|
[20:24] mikko
|
yes
|
[20:24] mikko
|
should work with both
|
[20:24] pieterh
|
mikko: very nice
|
[20:25] pieterh
|
sustrik: it'll have to do then, I need an issue for any change in 2.1...
|
[20:25] sustrik
|
sure, create one
|
[20:25] xristos
|
can someone take a look at this: http://paste.lisp.org/display/123275
|
[20:25] xristos
|
it is very short
|
[20:25] sustrik
|
i guess it was never recorded in the bug tracker
|
[20:25] pieterh
|
do we have a test case for this? does the shutdown stress test do it ?
|
[20:25] xristos
|
i start a server that binds to a pub socket
|
[20:26] xristos
|
then start 1000 clients that subscribe to it
|
[20:26] xristos
|
most of the time i get # Assertion failed: new_sndbuf > old_sndbuf (mailbox.cpp:183)
|
[20:26] sustrik
|
pieterh: try shutdown_stress + increase number of parallel threads (THREAD_COUNT)
|
[20:26] xristos
|
sometimes it works
|
[20:26] pieterh
|
sustrik: is xristos' example not the socketpair exhaustion?
|
[20:26] sustrik
|
basically, to reproduce it you need an application thread that is evented but doesn't process the events
|
[20:27] sustrik
|
so, for example you can create a socket, bind it and never touch it again
|
[20:27] sustrik
|
then create and destroy peers quickly
|
[20:27] pieterh
|
sustrik: is this it? # Assertion failed: new_sndbuf > old_sndbuf (mailbox.cpp:183)
|
[20:27] sustrik
|
yes
|
[20:27] pieterh
|
lol
|
[20:28] pieterh
|
xristos: we're actually looking for you and your test case... welcome!
|
[20:28] xristos
|
cool
|
[20:28] xristos
|
i use 256 clients in production
|
[20:28] xristos
|
and it works fine
|
[20:28] pieterh
|
literally, the line before you posted was asking if we had such a test case
|
[20:28] xristos
|
but if i could pump them up to 1000+
|
[20:28] pieterh
|
what language is this?
|
[20:29] xristos
|
python
|
[20:29] pieterh
|
xristos, can I ask you to create an issue for this, with your test case, at https://zeromq.jira.com
|
[20:29] pieterh
|
then when/if we have a fix I'll ask you to test it for us
|
[20:30] xristos
|
http://paste.lisp.org/display/123275#2
|
[20:30] xristos
|
this should be better
|
[20:30] xristos
|
sure
|
[20:30] pieterh
|
actually a plain C testcase would be best
|
[20:30] pieterh
|
if I can I'll make one, based on your Python code
|
[20:31] pieterh
|
thanks for coming to IRC, you can't know what perfect timing that was
|
[20:31] sustrik
|
pieterh: should be easy to reproduce, you can use inproc for that
|
[20:31] sustrik
|
s = socket (ZMQ_PUB);
|
[20:31] sustrik
|
s.bind ("inproc://a");
|
[20:32] sustrik
|
while (true) {
|
[20:32] sustrik
|
s2 = socket (ZMQ_SUB);
|
[20:32] sustrik
|
s2.connect ("inproc://a");
|
[20:32] sustrik
|
s2.close ();
|
[20:32] sustrik
|
}
|
[20:33] pieterh
|
I'll try it
|
[20:37] pieterh
|
sustrik: ack, that kills it, after 430 sockets
|
[20:38] pieterh
|
xristos: I've got a native C test case working
|
[20:38] xristos
|
ok
|
[20:39] xristos
|
it's only for PUB/SUB from what i see here
|
[20:39] xristos
|
PULL/PUSH works fine up to 5k
|
[20:39] xristos
|
that i've tried
|
[20:55] xristos
|
pieterh: will you file the issue yourself?
|
[20:55] pieterh
|
sustrik: well, I have a test case but it dies at 510 sockets with 'too many open files' (after the change)
|
[20:56] pieterh
|
xristos: I'd prefer if you file the issue, and test the eventual change
|
[20:56] pieterh
|
it is better that way
|
[21:02] xristos
|
filed
|
[21:02] pieterh
|
cool!
|
[21:02] pieterh
|
well, I think I have a fix for it, seems to work in 2.1.x
|
[21:03] pieterh
|
xristos: are you up to building 2.1.x from git master?
|
[21:07] xristos
|
i will, tomorrow, i'm leaving work atm
|
[21:10] pieterh
|
ok, I'll post some comments on the jira issue
|
[21:23] benwaine
|
hello
|
[21:24] pieterh
|
benwaine: hi
|
[21:25] benwaine
|
I've been reading through 'the guide' and have a question. Early on it mentions 'if you are opening lots of sockets all the time you're probably doing it wrong' (paraphrase)
|
[21:26] benwaine
|
I wanted to ask - i'm thinking of having a fixed queue which messages are pushed into.
|
[21:27] benwaine
|
A user would enter a term into the ui and on the server side a socket would be connected to the queue and that one message would be sent.
|
[21:27] benwaine
|
that's a lot of connects. Is this wrong?
|
[21:28] pieterh
|
benwaine: you'
|
[21:28] pieterh
|
you'd probably want to keep server processes running
|
[21:28] pieterh
|
and only open / connect sockets the first time
|
[21:29] pieterh
|
if only because opening / connecting a socket over TCP takes a small while and that'll add latency to your UI for nothing
|
[21:31] benwaine
|
hmm ok. I'm writing this in PHP. Can you suggest how I should send the message to the queue without using a connect. Some kind of wrapper round the push socket?
|
[21:33] pieterh
|
doesn't php let you keep server processes alive?
|
[21:33] pieterh
|
this is like database handles, I guess
|
[21:34] benwaine
|
I could run a script in a big loop (this is what i'm doing for the actual queue) but i think the connect would have to be on a per request basis. I'm still new to this though. Open to suggestions :)
|
[21:34] pieterh
|
I don't know PHP in this regard
|
[21:35] pieterh
|
my advice is to open sockets only when you need to, cache them, reuse them
|
[21:35] pieterh
|
that'll give you best performance
|
[21:35] pieterh
|
however if you need to get started simply, just open them on demand, close when done
|
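Editor's sketch: "open sockets only when you need to, cache them, reuse them" is a lazy lookup table keyed by endpoint. The sketch is in Python rather than PHP and deliberately avoids any real messaging library. `SocketCache` and the `fake_connect` stand-in are hypothetical, and the endpoint string is made up.

```python
class SocketCache:
    """Opens a connection on first use and hands back the same one afterwards."""

    def __init__(self, connect):
        self._connect = connect   # callable: endpoint -> connected socket
        self._sockets = {}        # endpoint -> cached socket

    def get(self, endpoint):
        sock = self._sockets.get(endpoint)
        if sock is None:
            sock = self._connect(endpoint)   # pay the connect cost once
            self._sockets[endpoint] = sock
        return sock


connect_calls = []


def fake_connect(endpoint):
    """Stand-in for creating and connecting a real socket."""
    connect_calls.append(endpoint)
    return object()


cache = SocketCache(fake_connect)
first = cache.get("tcp://queue.example:5555")
second = cache.get("tcp://queue.example:5555")   # reused, no second connect
```

This only helps when the process outlives a single request. Where each request runs in a fresh process, the cache starts empty every time and the simpler open-on-demand approach applies.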
[21:35] benwaine
|
ok. I'll have a play round and keep reading the guide. Thanks for your time.
|
[21:35] pieterh
|
np
|
[21:36] pieterh
|
sustrik: ok, backported the socketpairs changes to 2.1...
|
[21:36] pieterh
|
it was quite a lot of changes, more than that single commit
|
[21:36] dontfudge
|
Hey all. I'm trying to get zmq/lua/zmsg going. I can't seem to find anywhere czmq for lua. Any clues?
|
[21:36] pieterh
|
dontfudge: czmq is for C, not Lua
|
[21:37] dontfudge
|
There's an example of Asynchronous client-server in Lua on zguide that requires zmsg, but I can't figure out where to get it.
|
[21:38] ianbarber
|
benwaine: yeah, you'd likely create and drop a socket each time in that model with php. The overhead is pretty low.
|
[21:38] ianbarber
|
dontfudge: it's usually in the same examples directory
|
[21:39] benwaine
|
thanks Ian
|
[21:39] dontfudge
|
It's not found in the git, but is here http://zguide.zeromq.org/lua:asyncsrv
|
[21:39] ianbarber
|
https://github.com/imatix/zguide/blob/master/examples/Lua/zmsg.lua
|
[21:39] dontfudge
|
yeah, saw that - but that's just an example on how to use zmsg in lua
|
[21:40] ianbarber
|
no, that's it
|
[21:40] ianbarber
|
zmsg started life as a helper for the guide examples in C, it got ported along with those examples
|
[21:40] dontfudge
|
<sheepish grin> ok thanks
|
[21:40] ianbarber
|
:)
|
[21:40] pieterh
|
ianbarber: thanks
|
[21:41] ianbarber
|
:)
|
[21:41] ianbarber
|
what are you porting down?
|
[21:41] ianbarber
|
the protocol bits for 2.2?
|
[21:41] pieterh
|
ianbarber: removing the socketpair limitations
|
[21:41] ianbarber
|
ah, cool
|
[21:42] pieterh
|
surprisingly it all seems to work after I've been at it with a hammer
|
[21:43] ianbarber
|
you'll be a converted C++er by the end of the year :)
|
[21:43] pieterh
|
ianbarber: sigh, it's looking that way... :-(
|
[21:44] pieterh
|
ianbarber: does the notion of using 2.2 as a stepping stone between 2.1 and 3.0 seem reasonable?
|
[21:45] ianbarber
|
yeah, i think it's good, means 3.0 can stay clear of b/c code, and people can bridge their infrastructure
|
[21:45] pieterh
|
good
|
[21:45] pieterh
|
ok, I'm off for the weekend, be back Monday
|
[21:46] pieterh
|
meetup in Geneva on Saturday, in case anyone's in that zone
|
[21:47] ianbarber
|
cool, have a good one, say hi to alvaro
|
[21:48] pieterh
|
for sure!
|
[22:32] gdan
|
has anyone tested to see which is faster: inproc or ipc?
|
[22:50] cloudhead
|
gdan: I'd assume inproc is faster, seeing as it's limited to a single process
|
[22:55] whack
|
cloudhead_: and wouldn't copy from process -> kernel and back again, probably also less syscalls
|
[22:56] whack
|
but at the point where you're maxing out a tcp/inproc/ipc thing you're probably hitting limits on cpu cache misses, etc
|
[22:57] cloudhead
|
yea ipc will have some overhead
|