[Time] Name | Message |
[01:50] kenkeiter
|
Okay, Rubyists with zmq experience -- question!
|
[02:23] Xin
|
test
|
[02:25] Xin
|
Does it make sense to add UDP support to 0mq?
|
[08:06] sustrik
|
Samy: pong
|
[08:06] Zao
|
Doesn't PGM stuff travel on UDP already?
|
[08:09] sustrik
|
Zao: PGM travels on IP
|
[08:09] sustrik
|
epgm is a convenience hack that travels on UDP instead of IP
|
[08:09] Zao
|
Oh, right.
|
[08:09] Zao
|
Never used any of them, so I went with faint memories of manpages past.
|
[09:18] omarkj
|
kleppari: Morning.
|
[09:29] mrm2m
|
Hey ho!
|
[09:30] mrm2m
|
Is there a lightweight introduction, how zmq works internally?
|
[09:33] pieterh
|
mrm2m: there is documentation of the source code and there are whitepapers
|
[09:33] pieterh
|
but there is nothing lightweight about 0MQ's internals, really
|
[09:34] mrm2m
|
ok
|
[09:39] Bruc
|
hey wat sup all
|
[10:00] kleppari
|
omarkj: hah, hi!
|
[10:01] omarkj
|
kleppari: Didn't notice you here until now, are you guys using 0mq?
|
[10:01] kleppari
|
I'm experimenting a bit
|
[10:02] kleppari
|
I just love the idea of message passing for concurrency
|
[10:03] kleppari
|
well, outside of erlang at least
|
[10:05] kleppari
|
how about yourself?
|
[10:07] omarkj
|
Working with it.
|
[10:07] omarkj
|
Let's go pm. :)
|
[12:49] keffo
|
sustrik, I think I might have figured out the concurrency issue
|
[12:49] sustrik
|
cremes: here i am
|
[12:50] sustrik
|
keffo: so what's the problem?
|
[12:51] keffo
|
I was rewriting (using poll rather than sequential) to clean up logging etc, and I just noticed one place where I send a null message before sending the rest, except "the rest" already included the incoming route-null, so I suspect it was being sent twice
|
[12:52] keffo
|
nullmore, nullmore, uuidmore, data, etc..
|
[12:52] sustrik
|
ok, i see
|
[12:53] sustrik
|
so no problem in 0mq to worry about
|
[12:53] keffo
|
Not at all
|
[12:53] sustrik
|
:)
|
[12:53] keffo
|
I never expected that either :)
|
[12:53] sustrik
|
you are an optimist then :)
|
[12:54] keffo
|
It's been very stable and well behaved so far, apart from this issue which is my own fault :)
|
[12:55] keffo
|
I did 100k pi calcs yesterday, with 10 worker procs on different machines.. the load balancer had ~500kb/s passing through it, and in total more than half a gig of logs were generated.. That's when I decided the logging was way wonky, didn't really tell much
|
[13:25] cremes
|
sustrik: i'm available for the next hour or so if you want to chat
|
[13:45] sustrik
|
cremes: hi
|
[13:46] sustrik
|
give me some context first: why have you moved from ruby-ffi to rbzmq?
|
[13:46] cremes
|
there are multiple ruby runtimes; not all of them support the C extension api that rbzmq uses
|
[13:47] cremes
|
ffi allows me to support *all* of the runtimes
|
[13:47] cremes
|
oh, and i haven't moved from one to the other; i would just like rbzmq to have the same api as ffi-rzmq
|
[13:48] sustrik
|
ok, i see
|
[13:48] sustrik
|
it looks like brian buchanan is not responding, right?
|
[13:48] cremes
|
right
|
[13:48] sustrik
|
ok, i have admin access to the project
|
[13:49] sustrik
|
i'll add you there as a developer
|
[13:49] mato
|
sustrik: cremes: a question; i was wondering myself why there were two ruby bindings
|
[13:49] mato
|
is there any point in keeping both around?
|
[13:49] sustrik
|
hi
|
[13:49] mato
|
if FFI is the way to go then why keep rbzmq at all?
|
[13:50] cremes
|
um... i don't know; it's probably best to get feedback from folks using rbzmq and ask them why they prefer it
|
[13:50] cremes
|
i'm biased towards the ffi one, obviously ;)
|
[13:51] mato
|
that might be a good idea; having two bindings is IMO confusing
|
[13:51] sustrik
|
does it work with any runtime?
|
[13:51] cremes
|
that's why i was hoping brian buchanan would speak up
|
[13:51] mato
|
cremes: is there a mailing list for this? I've not seen any discussion on zeromq-dev...
|
[13:52] mato
|
maybe just ping there, outline the situation (two bindings, ffi works on all runtimes, etc.) and ask the community
|
[13:52] cremes
|
this has not been brought up on the zeromq-dev list
|
[13:52] cremes
|
i'll do that
|
[13:53] sustrik
|
can you explain how the runtime interoperability works?
|
[13:53] sustrik
|
so far i had an impression that ffi doesn't work with all the ruby runtimes
|
[13:53] cremes
|
sustrik: as of a few days ago, ffi works with all of the runtimes
|
[13:53] sustrik
|
aha
|
[13:53] cremes
|
i worked out the last details with the runtime guys
|
[13:53] sustrik
|
so it's new
|
[13:54] cremes
|
the issue with C extensions is that they oftentimes access lots of internal memory structures
|
[13:54] sustrik
|
that may actually mean that rbzmq is not needed any more
|
[13:54] mato
|
cremes: what's the status of FFI support for Ruby MRI (i.e. that which most people are using right now)?
|
[13:54] cremes
|
ffi allows the runtime authors to hide their implementation details and provide a more solid mechanism for accessing C libraries
|
[13:55] sustrik
|
i see
|
[13:55] cremes
|
mato: it's quite good
|
[13:55] mato
|
cremes: right, but does "quite good" mean "production quality"?
|
[13:55] mato
|
i guess what i'm asking is if rbzmq goes away then what will that mean for the current mainstream of Ruby MRI users
|
[13:56] mato
|
will they all just happily start using the FFI binding, or not?
|
[13:56] cremes
|
mato: perhaps; i can only speak for how it works with 0mq; obviously we still have hangs until 2.1.x comes out with EINTR support for blocking calls
|
[13:56] mato
|
sure, but the EINTR hangs are orthogonal to FFI/non-FFI
|
[13:56] sustrik
|
well, whatever happens the rbzmq project is not going to be killed
|
[13:56] sustrik
|
so no worry
|
[13:56] cremes
|
mato: i'll ask for feedback on the ML
|
[13:57] sustrik
|
the problem is that it doesn't have a permanent maintainer
|
[13:57] cremes
|
mato: sure, it's orthogonal, but it's hard to say if there are other bugs when you get a hang
|
[13:57] cremes
|
is it due to the blocking behavior or something else?
|
[13:57] mato
|
well then put a note up about that (lack of maintainer) on the rbzmq project page
|
[13:57] cremes
|
i don't have the time or energy to run stuff under gdb all of the time to figure that out :)
|
[13:58] cremes
|
let's take this to the ML and see what users have to say
|
[13:58] mato
|
yup
|
[13:58] sustrik
|
ok
|
[13:58] cremes
|
btw, from a search on github most projects are using the ffi-based binding
|
[13:58] cremes
|
but i certainly don't want to lose or alienate the rbzmq users
|
[13:59] cremes
|
which takes us full circle; i'd like each one to present the same api to the user (based off of the C and C++ binding apis)
|
[13:59] mato
|
definitely
|
[13:59] mato
|
otherwise it just all gets way too confusing
|
[13:59] cremes
|
if people can switch back and forth with no code changes, that's a great situation for the community
|
[13:59] mato
|
is it much work to synchronize the APIs?
|
[14:00] cremes
|
mato: not a lot but unfortunately i don't have the C chops to do it myself
|
[14:00] cremes
|
so i need someone else to agree to it
|
[14:00] cremes
|
to do the work
|
[14:00] mato
|
cremes: well, ask for help on the list and we'll see what happens
|
[14:00] cremes
|
will do
|
[14:05] sustrik
|
mato: man bug!
|
[14:05] sustrik
|
zmq_socket(3):
|
[14:05] sustrik
|
"When a ZMQ_XREQ socket is connected to a ZMQ_REP socket each message sent must consist of an empty message part, the delimiter, followed by one or more body parts."
|
[14:05] mato
|
sustrik: !!?!? :-)
|
[14:06] sustrik
|
the identities should be mentioned
|
[14:06] sustrik
|
they are definitely mentioned for ZMQ_XREP
|
[14:06] ptrb
|
yeah, it's not really clear to me what work i have to do at the app level to do XREQ/REP (or REQ/XREP)
|
[14:07] mato
|
uh, yeah, i kind of punted on explaining the identity stack on the XREQ side at the time
|
[14:07] mato
|
sustrik: that stuff was written in a hurry, you remember :)
|
[14:07] sustrik
|
ptrb: there was a nice diagram somewhere...
|
[14:07] mato
|
sustrik: anyway, it's not a bug, it's just a simplification :)
|
[14:07] mato
|
sustrik: i'll look into it.
|
[14:08] cremes
|
ptrb: was this helpful at all? http://www.zeromq.org/tutorials:xreq-and-xrep/
|
[14:08] cremes
|
if not, tell me what is unclear and i'll fix it (or edit the page yourself)
|
[14:08] ptrb
|
*clicks*
|
[14:08] sustrik
|
ah -- that's the one i meant
|
[14:08] cremes
|
i was certainly confused by that stuff which is why i wrote it down
|
[14:09] ptrb
|
yeah, it definitely helps
|
[14:09] ptrb
|
but ruby is definitely not the right language to use in examples like this
|
[14:09] ptrb
|
it's as opaque as perl without the benefit of perl's saturation
|
[14:09] cremes
|
ha
|
[14:10] mato
|
:-)
|
[14:45] CIA-20
|
zeromq2: Martin Sustrik maint * re2802d9 / src/options.cpp : values of RATE, RECOVERY_IVL and SWAP options are checked for negative values - http://bit.ly/9lAWvv
|
[14:46] pieterh
|
sustrik: nice to see the validation of option values
|
[14:47] sustrik
|
there are 3 bugs filed by cremes :)
|
[14:47] mato
|
sustrik: incidentally, there's one annoying thing regarding the option values
|
[14:48] sustrik
|
yes?
|
[14:48] mato
|
well, it's the whole "option types" issue
|
[14:48] sustrik
|
?
|
[14:48] mato
|
one thing that i realised the other day when i was writing some simple test cases
|
[14:48] mato
|
is that in order to use options sensibly from C, you have to have uint64_t / int64_t defined
|
[14:49] mato
|
now, obviously those are not defined by zmq.h
|
[14:49] mato
|
of course if you want to use the API you have to do "platform specific stuff" to get those types
|
[14:49] mato
|
not a good situation :-(
|
[14:50] mato
|
especially on windows where M$ does not ship stdint.h
|
[14:50] sustrik
|
i know
|
[14:50] sustrik
|
what can i do without breaking the backward compatibility?
|
[14:51] mato
|
probably nothing
|
[14:51] sustrik
|
:|
|
[14:51] mato
|
hmm, maybe...
|
[14:52] mato
|
well, in theory you could define zmq_ namespaced equivalents of those types including the platform magic in zmq.h, but that's kind of ugly
|
[14:52] mato
|
and it'd probably still break C++ due to its strict notions of types
|
[14:52] sustrik
|
it has to be solved properly
|
[14:53] sustrik
|
BSD-style
|
[14:53] mato
|
hmm
|
[14:53] sustrik
|
4-byte unsigned integer in network byte order
|
[14:53] mato
|
hang on
|
[14:53] sustrik
|
and such
|
[14:53] mato
|
what?
|
[14:53] mato
|
what does network byte order have to do with it? :)
|
[14:53] sustrik
|
that's what i do when using BSD sockets
|
[14:53] mato
|
?
|
[14:54] sustrik
|
addr.port = htons (port);
|
[14:58] mato
|
sustrik: ah, you're right, i didn't realise e.g. sockaddr_in.sin_port was in network byte order
|
[14:58] mato
|
sustrik: the thing is, the setsockopt stuff is somewhat ad-hoc
|
[14:58] CIA-20
|
zeromq2: Martin Sustrik master * re2802d9 / src/options.cpp : values of RATE, RECOVERY_IVL and SWAP options are checked for negative values - http://bit.ly/9lAWvv
|
[14:58] CIA-20
|
zeromq2: Martin Sustrik master * rff10807 / src/options.cpp :
|
[14:58] CIA-20
|
zeromq2: Merge branch 'maint'
|
[14:58] CIA-20
|
zeromq2: * maint:
|
[14:58] CIA-20
|
zeromq2: values of RATE, RECOVERY_IVL and SWAP options are checked for negative values - http://bit.ly/9P99Fu
|
[14:58] mato
|
it does not use network byte order AFAIK
|
[14:59] mato
|
sustrik: anyway thinking about it, i guess zmq.h will have to define its own types
|
[15:00] mato
|
the main problem with that is doing it in a portable fashion :-(
|
[15:07] ptrb
|
#defines, #defines for everybody!!
|
[15:07] mato
|
sure, feel free to suggest how to portably define a 64-bit unsigned integer type :-)
|
[15:08] ptrb
|
you just take two ints and mash 'em together, obviously
|
[15:08] mato
|
:-)
|
[15:29] sustrik
|
mato: btw, there's some packaging PATCH on the mailing list
|
[15:29] sustrik
|
will you reply to that?
|
[15:30] mato
|
sustrik: yes, later, i know it's there
|
[15:30] mato
|
have to go now
|
[15:30] sustrik
|
ok, there's one in the bug tracker as well
|
[15:30] mato
|
sustrik: i'd have just applied it but the license stuff needs to be sorted out
|
[15:31] mato
|
else all i can reply is "please state your patch is licensed under..."
|
[15:31] mato
|
have discussed with pieter will update you this evening
|
[15:31] mato
|
or pieter will
|
[15:31] mato
|
bbl
|
[15:31] sustrik
|
should i ask for the license?
|
[15:31] mato
|
don't bother, let's just fix it
|
[15:31] mato
|
talk to you in the evening
|
[15:31] sustrik
|
ok
|
[15:41] Samy
|
sustrik, just had some questions regarding zeromq's lock-less data structures.
|
[15:42] sustrik
|
yes?
|
[15:42] Samy
|
sustrik, I'm working on a library to help ease concurrent programming, it includes a plethora of concurrent data structures, synchronization methods and mechanisms for SMR (currently using hazard pointers, considering implementing RCU).
|
[15:43] Samy
|
sustrik, I was curious if there were some features you would really like to see, and some specific constrained data structures (M:N consumer/producers) for zeromq.
|
[15:43] Samy
|
sustrik, I was also curious if SMR was important for ZeroMQ at all.
|
[15:44] sustrik
|
sorry, what's SMR?
|
[15:44] Samy
|
sustrik, safe memory reclamation.
|
[15:45] Samy
|
sustrik, usually it's important for unbounded lock-less data structures (a simple stack being a great example).
|
[15:46] sustrik
|
does it translate to "make the current state of the memory visible to the other CPU cores"?
|
[15:47] sustrik
|
sorry, i am not an expert on lock-free algos
|
[15:48] Samy
|
sustrik, well, I guess fundamentally, it's more like "let other CPUs know I am not using this memory".
|
[15:48] sustrik
|
ah, ok, let me explain what 0mq is doing
|
[15:48] Samy
|
sustrik, some techniques will require a full barrier (RCU) while others do not (hazard pointers).
|
[15:48] Samy
|
sustrik, ok.
|
[15:48] sustrik
|
the communication always happens between exactly two endpoints
|
[15:49] sustrik
|
ie. at most 2 cpu cores
|
[15:49] sustrik
|
so each lock-free queue has exactly one writer and exactly one reader
|
[15:49] Samy
|
Ah, makes it much easier. :-)
|
[15:49] Samy
|
Sorry, let me pop open the source-code.
|
[15:49] sustrik
|
src/ypipe.hpp
|
[15:49] Samy
|
Thanks, loading.
|
[15:50] sustrik
|
what it does basically is that writer is appending new items to the linked list
|
[15:50] sustrik
|
the reader is reading items from the linked list
|
[15:50] Samy
|
I don't see any padding, sustrik.
|
[15:51] sustrik
|
what padding?
|
[15:51] Samy
|
Between w and r.
|
[15:51] Samy
|
sustrik, to prevent false cache line sharing, this improves concurrency.
|
[15:51] sustrik
|
that would be nice
|
[15:51] Samy
|
sustrik, it can be drastic. :-)
|
[15:51] Samy
|
Let me show you a simple example on this machine.
|
[15:52] sustrik
|
ok, so what has to be done is separate the variables that belong to the reader and those that belong to the writer
|
[15:53] sustrik
|
and keep them in different cachelines
|
[15:53] sustrik
|
right?
|
[15:53] keffo
|
the data too
|
[15:53] Samy
|
[sbahra@sbahra validate]$ time ./ck_fifo_spsc 8 1 100000
|
[15:53] Samy
|
real 0m5.294s
|
[15:53] Samy
|
[sbahra@sbahra validate]$ time ./ck_fifo_spsc 8 1 100000
|
[15:53] Samy
|
real 0m7.718s
|
[15:54] Samy
|
sustrik, the latter is without appropriate padding.
|
[15:54] sustrik
|
nice
|
[15:54] sustrik
|
the data are allocated in a contiguous block
|
[15:54] keffo
|
that was pretty substantial, what arch is that?
|
[15:54] sustrik
|
to avoid excessive memory allocation
|
[15:54] Samy
|
sustrik, that is a simple benchmark that creates a token ring using a single-producer/single-consumer lock-less queue (passes around 100000 across 8 threads some number of iterations)
|
[15:55] Samy
|
keffo, that is on a Nehalem box.
|
[15:55] keffo
|
aho
|
[15:55] keffo
|
sustrik, You should look into pooling, not only for that
|
[15:55] Samy
|
sustrik, you want the writer and reader variables to be on separate cache lines.
|
[15:55] sustrik
|
right, i understand that
|
[15:55] Samy
|
sustrik, so for example, void *reader; char pad[56]; void *writer; ...
|
[15:56] Samy
|
sustrik, ok.
|
[15:56] sustrik
|
not sure how to do that in portable fashion though
|
[15:56] keffo
|
compile time :)
|
[15:56] keffo
|
macro hell
|
[15:56] ptrb
|
did someone say #define??
|
[15:56] ptrb
|
:D
|
[15:56] Samy
|
sustrik, not very portable (my library tries to make that portable by doing the hard work of generating those constants at compile-time).
|
[15:56] sustrik
|
:)
|
[15:56] Samy
|
sustrik, but modern IA32 and SPARCv9 boxes I know of all have 64 byte cache lines.
|
[15:56] sustrik
|
well, if we assumed the cache line size is 64 bytes
|
[15:56] keffo
|
cpuid fiddling works I guess :)
|
[15:57] sustrik
|
we would be right in most cases
|
[15:57] sustrik
|
no?
|
[15:57] keffo
|
all x86_64 is >64 no?
|
[15:57] sustrik
|
128?
|
[15:57] Samy
|
keffo, no.
|
[15:57] Samy
|
keffo, 64.
|
[15:57] Samy
|
keffo, when I say IA32, I mean x86/x86_64.
|
[15:57] keffo
|
>=64 then :)
|
[15:57] Samy
|
I don't know of any > 64.
|
[15:58] sustrik
|
padding to 64 bytes seems reasonable imo
|
[15:58] keffo
|
I meant, x86_64 is presumably >=64, while x86 is not
|
[15:58] sustrik
|
that would be 64 bytes per reader and 64 bytes per writer
|
[15:59] sustrik
|
the shared variable should probably be on a separate cacheline
|
[15:59] keffo
|
Samy, Have you tried measuring on itanium? :)
|
[15:59] sustrik
|
so it's 192 bytes per queue
|
[15:59] Samy
|
keffo, Itanium is very interesting, but barely used, so no.
|
[16:00] Samy
|
keffo, I would like to work on a port of my library there but I have had design issues, just recently did a major design change.
|
[16:00] keffo
|
Yeah, void of any cachemisses it should be
|
[16:00] Samy
|
sustrik, sorry, can you continue with your explanation of ypipe?
|
[16:00] sustrik
|
ah
|
[16:00] sustrik
|
ok, so part of the list belongs to the writer thread
|
[16:00] sustrik
|
part of it to the reader thread
|
[16:01] sustrik
|
the only point where the two interact is when reader has no more data to read
|
[16:01] sustrik
|
then it does CAS
|
[16:01] sustrik
|
to get the writer's portion of the list
|
[16:02] Samy
|
So, basically, you batch the operations?
|
[16:02] sustrik
|
exactly
|
[16:02] sustrik
|
that's the key
|
[16:02] Samy
|
So, writer does enqueue operations to the queue.
|
[16:02] sustrik
|
ack
|
[16:02] Samy
|
The reader occasionally does a batch dequeue.
|
[16:02] sustrik
|
ack
|
[16:02] Samy
|
Ok.
|
[16:02] Samy
|
sustrik, why do you use CAS for the batch dequeue? That can be implemented using an xchg, making the batch dequeue a wait-free operation.
|
[16:02] sustrik
|
it's pretty effective just because of the batching
|
[16:03] sustrik
|
hm, i haven't seen the code for a long time
|
[16:03] sustrik
|
let me have a look
|
[16:04] Samy
|
It looks like you do a single CAS.
|
[16:04] Samy
|
Line 137, check_read (I assume?)
|
[16:05] sustrik
|
"If there are no
|
[16:05] sustrik
|
// items to prefetch, set c to NULL"
|
[16:05] sustrik
|
Samy: yes
|
[16:05] sustrik
|
the cas is used to communicate back to the writer
|
[16:05] Samy
|
You can implement this without any lock-free operations.
|
[16:06] sustrik
|
you mean without atomic ops?
|
[16:06] sustrik
|
or bus locking?
|
[16:07] Samy
|
Without explicit atomic operations.
|
[16:07] Samy
|
sustrik, the idea is to share a stub node for both tail and head of the queue.
|
[16:08] sustrik
|
can you explain in more detail?
|
[16:08] Samy
|
sustrik, writer updates tail, always updating the next pointer, and reader always assume first node is stub entry.
|
[16:08] Samy
|
sustrik, I can show you source-code.
|
[16:08] Samy
|
Let me see if I find it, hold on.
|
[16:09] sustrik
|
ok
|
[16:09] sustrik
|
how does the communication between two cpu cores happen then?
|
[16:09] sustrik
|
cache coherency algos?
|
[16:11] Samy
|
sustrik, yes.
|
[16:11] Samy
|
sustrik, the point is, the only time we need to really share state is if the queue is empty.
|
[16:11] sustrik
|
ack
|
[16:11] Samy
|
sustrik, we need a way to detect if the queue is empty atomically and update both head and tail atomically.
|
[16:11] Samy
|
sustrik, by sharing a stub node, this is possible.
|
[16:12] sustrik
|
i think something like that is done in ypipe
|
[16:12] sustrik
|
there's an empty item in the list
|
[16:12] sustrik
|
that serves as a placeholder
|
[16:12] sustrik
|
that's what you mean by stub, right?
|
[16:12] Samy
|
Yes.
|
[16:12] Samy
|
sustrik, http://codepad.org/gtPkkz57
|
[16:13] Samy
|
This isn't fenced correctly, but that should be fine on IA32.
|
[16:14] sustrik
|
i thought that IA requires you to fence explicitely
|
[16:14] sustrik
|
ah, you mean x86
|
[16:14] Samy
|
Yes.
|
[16:15] Samy
|
The fences may be sufficient, I just haven't verified it on non-IA32 yet. :-)
|
[16:15] sustrik
|
ok, there's one more issue there
|
[16:15] sustrik
|
when reader finds out that there are no more items to read
|
[16:15] sustrik
|
it goes asleep
|
[16:16] sustrik
|
it's the writer's responsibility to wake it up
|
[16:16] sustrik
|
so the writer has to be informed about the fact that reader tried to get more items and failed
|
[16:16] Samy
|
sustrik, futex(2) works well.
|
[16:16] sustrik
|
yes, it's similar
|
[16:17] sustrik
|
but keep in mind that this is a multi-platform app
|
[16:17] Samy
|
Things sort of suck without futex. :-(
|
[16:18] Samy
|
(or a similar mechanism, at least in ring-3)
|
[16:18] sustrik
|
that's how it is :|
|
[16:18] Samy
|
But you could abstract a generic CV layer that uses futex directly if available.
|
[16:18] Samy
|
sustrik, where is this wake-up mechanism implemented?
|
[16:18] sustrik
|
out of the class, it's a dumb socket pair
|
[16:19] sustrik
|
i had a futex implementation once
|
[16:19] Samy
|
What happened?
|
[16:19] sustrik
|
but it was pain in the ass to make it work everywhere
|
[16:19] Samy
|
I see.
|
[16:19] sustrik
|
some linux kernels pretend to have futexes
|
[16:19] sustrik
|
but actually return ENOTSUP when you try to use them
|
[16:19] sustrik
|
and alike
|
[16:19] Samy
|
I see.
|
[16:20] sustrik
|
anyway, the wake up mechanism is irrelevant for this discussion
|
[16:20] Samy
|
Mostly, yes.
|
[16:20] sustrik
|
what's relevant is the writer has to be notified that reader is sleeping
|
[16:20] sustrik
|
that's what the second cas is for
|
[16:20] sustrik
|
in flush function
|
[16:21] Samy
|
I don't understand.
|
[16:21] Samy
|
Why not simply use plain loads and stores if this is single reader and consumer?
|
[16:22] sustrik
|
because one bit of information is passed the other way round
|
[16:22] sustrik
|
from reader to writer
|
[16:22] sustrik
|
"i am sleeping"
|
[16:22] sustrik
|
that's when c is set to NULL
|
[16:22] sustrik
|
maybe it can be done without atomic ops, i am not an expert
|
[16:23] sustrik
|
however, if it was a normal locking code, this would be a place where races can occur
|
[16:24] Samy
|
Ok.
|
[16:24] sustrik
|
so, reader tries to get the latest batch of the items
|
[16:24] sustrik
|
and if there are none, it sets c to NULL
|
[16:25] sustrik
|
in flush function, writer adds new batch of items and if it finds out that c was NULL previously
|
[16:25] sustrik
|
it knows the reader is sleeping
|
[16:25] sustrik
|
and that it should wake it up
|
[16:26] sustrik
|
my feeling is that the operations on c have to be atomic
|
[16:26] sustrik
|
not 100% sure though
|
[16:30] Samy
|
read_asleep = false; loop { old = xchg(top, NULL); if old == null then reader_asleep = true; sleep(); }
|
[16:31] Samy
|
Sorry, read_asleep is in loop.
|
[16:32] sustrik
|
that's the writer side?
|
[16:33] Samy
|
Reader.
|
[16:34] Samy
|
There is still a risk of spurious wake-up, but that isn't a big deal.
|
[16:34] Samy
|
If it is, then you basically implement a barrier.
|
[16:34] Samy
|
You have a reader and a writer shared variable. Writer signals reader once, only if reader has signaled writer since last wake-up.
|
[16:35] sustrik
|
yes, that's the goal
|
[16:36] Samy
|
And the algorithm.
|
[16:36] Samy
|
Me too.
|
[16:36] sustrik
|
can you give a pseudo-code for writer side as well?
|
[16:38] sustrik
|
in the code above reader_asleep is another shared variable, right?
|
[16:39] pieterh
|
sustrik: I was trying to say something about options but got interrupted by stuff...
|
[16:40] pieterh
|
API should IMO reject an identity starting with zero byte
|
[16:40] sustrik
|
now you are likely to get interrupted by brain-damaging lock-free algo discussion :)
|
[16:40] pieterh
|
:-/
|
[16:40] sustrik
|
yes, that would be nice
|
[16:41] Samy
|
reader: loop { r = false; store_fence; w = true; o = xchg(top, NULL); if old == null then { r = true; sleep(); } }
|
[16:41] Samy
|
writer: add(); c_w = w; load_fence; c_r = r; if c_r == true and c_w == true; then { c_w = false; signal_reader; }
|
[16:41] sustrik
|
pieter: fancy sumbitting a patch?
|
[16:41] pieterh
|
sustrik: I'm pulling master as we speak, yes I'll make you a patch
|
[16:42] sustrik
|
thanks
|
[16:42] sustrik
|
Samy: sorry, having 3 discussions in parallel
|
[16:43] Samy
|
Hey, no problem. :-)
|
[16:43] Samy
|
I'll be back in some minutes.
|
[16:43] sustrik
|
sure, checking your code right now
|
[16:46] sustrik
|
what's top?
|
[16:46] sustrik
|
which variable belong to whom?
|
[16:46] sustrik
|
both r and w seem to be shared
|
[16:49] pieterh
|
sustrik: np. options.cpp already checks that... :-)
|
[16:49] sustrik
|
good
|
[16:51] Samy
|
sustrik, yes, they are shared.
|
[16:52] Samy
|
sustrik, that statement is whatever condition you use to check if the queue is empty.
|
[16:53] sustrik
|
this one: o = xchg(top, NULL); ?
|
[16:53] Samy
|
That and next.
|
[16:53] Samy
|
o == null
|
[16:53] Samy
|
(not old)
|
[16:54] sustrik
|
so c_r and c_w are local to the writer?
|
[16:54] Samy
|
Yes.
|
[16:54] Samy
|
The reason we have 2 variables is that writer will signal the reader only once.
|
[16:55] Samy
|
The reader will then wake-up, and if the queue is empty it will indicate that it is asleep and it is fine for the writer to send another signal.
|
[16:56] Samy
|
sustrik, if you treat it like a stack, it will make more sense.
|
[16:56] sustrik
|
Samy: stack of what?
|
[16:56] Samy
|
sustrik, for a FIFO, you can use another technique.
|
[16:57] Samy
|
sustrik, of objects.
|
[16:57] sustrik
|
that's the case in 0MQ
|
[16:57] Samy
|
sustrik, this is in reference of xchg(top, ...) :)
|
[16:57] Samy
|
sustrik, what's of relevance to you is the notion of the r and w variables, that's all.
|
[16:58] sustrik
|
r = "i am sleeping"
|
[16:58] sustrik
|
what about w?
|
[16:58] Samy
|
w = "reader has woken-up since the last signal"
|
[16:59] sustrik
|
something is missing
|
[16:59] sustrik
|
w is never set to false
|
[16:59] Samy
|
Oh, sorry.
|
[16:59] Samy
|
{ c_w = false; signal_reader; } should be { w = false; signal_reader; }
|
[17:00] sustrik
|
ok, now it makes sense
|
[17:00] sustrik
|
now, what about the fences
|
[17:01] sustrik
|
does it sync the data in the queue as well?
|
[17:02] Samy
|
You can figure that out.
|
[17:02] Samy
|
That is mainly meant for those flags.
|
[17:02] Samy
|
The idea: the reader flag should always appear to be set to false before the writer flag is set to true (it must indicate it has woken up before it indicates it is fine to send another signal).
|
[17:03] sustrik
|
got it
|
[17:04] sustrik
|
that part ensures that everything happens in nice step-lock fashion
|
[17:04] Samy
|
Right.
|
[17:04] sustrik
|
now few technical questions...
|
[17:04] Samy
|
I'm no expert, but ok.
|
[17:05] sustrik
|
what does lock;xchg() actually mean
|
[17:05] sustrik
|
is there an implied barrier?
|
[17:05] Samy
|
On IA32, atomic operations have a total order across all processors.
|
[17:05] Samy
|
There is an implicit pipeline flush.
|
[17:06] Samy
|
There is an implied barrier.
|
[17:06] Samy
|
There is also no need for the lock prefix on xchg.
|
[17:06] Samy
|
xchg is guaranteed to be atomic (which is what also makes it expensive to use, if you don't need it).
|
[17:06] sustrik
|
in that case there's no global ordering, right?
|
[17:06] Samy
|
Usually, yes.
|
[17:07] Samy
|
add is not guaranteed to be atomic, lock add is.
|
[17:07] Samy
|
xchg is always atomic.
|
[17:07] sustrik
|
oh my
|
[17:07] Samy
|
It's expensive compared to simple loads and stores, that's for sure.
|
[17:07] sustrik
|
so there's an implied barrier on xchg, right?
|
[17:07] Samy
|
Yes.
|
[17:07] sustrik
|
then, in your code we have 2 barriers
|
[17:08] Samy
|
For more information, you can see Volume 3 of the Intel Architecture Manuals, Chapter 8.
|
[17:08] sustrik
|
isn't it better to use single cas with a barrier then?
|
[17:08] Samy
|
sustrik, you don't need any of those barriers for IA32. Stores are always seen in order.
|
[17:09] Samy
|
sustrik, CAS is expensive.
|
[17:09] Samy
|
sustrik, it's best to avoid the lock prefix completely if possible.
|
[17:09] sustrik
|
ack
|
[17:09] Samy
|
sustrik, atomic loads and stores do this.
|
[17:10] sustrik
|
atomic load & stores == have global ordering ?
|
[17:10] Samy
|
sustrik, in theory, CAS can provide infinite consensus. If you don't need infinite consensus, you might not need CAS at all.
|
[17:10] sustrik
|
anyway, i have to get my head around it
|
[17:11] sustrik
|
are you around somewhere?
|
[17:11] Samy
|
Processor ordering.
|
[17:11] sustrik
|
email, or so?
|
[17:11] Samy
|
sbahra@repnop.org
|
[17:11] Samy
|
sustrik, I'd love your feedback on this library I'm working on sometime.
|
[17:11] Samy
|
sustrik, what architectures does ZeroMQ support?
|
[17:12] sustrik
|
we've tested on x86, itanium, sparc, ppc
|
[17:12] sustrik
|
arm
|
[17:12] Samy
|
sustrik, if you see section 8.2.2 in Volume 3 of the IA32 manuals, you'll get a nice break-down of the memory ordering.
|
[17:12] Samy
|
sustrik, does the ZeroMQ project have regular access to such boxes?
|
[17:13] sustrik
|
we have an itanium and sparc box
|
[17:13] sustrik
|
no ppc
|
[17:13] Samy
|
Itanium, nice. :-)
|
[17:13] Samy
|
My PPC box died (Mac Mini) recently. Need to fix it.
|
[17:14] Samy
|
I sold my only decent SPARCv9 box, at work our SPARCv9 machines are tied to our build cluster.
|
[17:14] Samy
|
sustrik, if I could get accounts on those, I can port my library's atomic interface for them to support those.
|
[17:14] Samy
|
sustrik, might be useful, at least as a reference implementation of some structures.
|
[17:15] Samy
|
sustrik, ;]
|
[17:15] sustrik
|
our sparc box is extremely old and slow :)
|
[17:15] Samy
|
Itanium is fine too.
|
[17:16] Samy
|
Well, Itanium is what I really want access to.
|
[17:16] sustrik
|
but mato mumbled something about bringing some 14-core sparc
|
[17:16] sustrik
|
ah
|
[17:16] sustrik
|
i have to ask mato, i think there's some applications running on it
|
[17:16] keffo
|
sandy bridge ftw!
|
[17:16] Samy
|
sustrik, ok.
|
[17:17] Samy
|
I look forward to future discussions. I would like to show you the work I've done so far for feedback.
|
[17:17] Samy
|
It is C, not C++, however.
|
[17:17] sustrik
|
Samy, i can have a look but as you see i am not an expert :)
|
[17:18] Samy
|
Well, that's fine. The idea is, what would it take to make you use this? As a 3rd party looking to integrate this into their product, what issues do you have with the interface?
|
[17:18] Samy
|
etc
|
[17:18] sustrik
|
what's the link to the lib?
|
[17:18] kleppari
|
we're ditching a couple of sun t1000 boxes at work
|
[17:18] Samy
|
No link. Tarball, sustrik.
|
[17:18] Samy
|
When we have a discussion, I can share that.
|
[17:18] kleppari
|
let me see if I can 'rescue' one of them
|
[17:19] sustrik
|
Samy: sustrik@250bpm.com
|
[17:20] Samy
|
Cool.
|
[17:20] Samy
|
I'm back to work, take care.
|
[17:20] sustrik
|
you too
|
[17:20] sustrik
|
bye
|
[17:20] sustrik
|
kleppari: i don't think it's really needed
|
[17:21] sustrik
|
unless you need it yourself
|
[17:21] kleppari
|
not really
|
[17:21] kleppari
|
kind of lost interest in solaris after oracle bought it
|
[17:21] kleppari
|
err, after oracle bought sun
|
[17:22] kleppari
|
I think they'll do a wonderful job killing it
|
[17:22] sustrik
|
yeah, looks like there will be a lot of sun boxes available :)
|
[17:22] kleppari
|
heh, yeah - red hat will make a fortune
|
[17:23] kleppari
|
but the t2000 was a good box, I still think 16 concurrent threads of execution per core is impressive
|
[17:29] pieterh
|
kleppari: we used to have a t2000 at iMatix when we were making OpenAMQ
|
[17:30] pieterh
|
it was a pretty impressive box, made a noise like an aircraft taking off
|
[17:30] pieterh
|
and was slower (all 32 cores or whatever) than a 2-core Athlon
|
[17:31] mato
|
pieterh: it still might be more useful than the old e4500 i can get for the project
|
[17:31] kleppari
|
slower at what?
|
[17:31] mato
|
pieterh: 0mq is a very different codebase from OpenAMQ
|
[17:32] pieterh
|
mato: perhaps, yes
|
[17:32] mato
|
kleppari: if you do get a chance to rescue one i think it'd be more interesting for development of the lockfree algorithms in 0mq than what i've been offered
|
[17:32] pieterh
|
kleppari: at raw I/O afaics
|
[17:32] pieterh
|
but the main difference was probably Linux vs. Solaris
|
[17:32] mato
|
kleppari: which is an old E4500 (fully spec'ed with 14 CPUs)
|
[17:32] kleppari
|
I found that these boxes perform pretty well with a huge runqueue
|
[17:32] kleppari
|
mato: let me see what I can do
|
[17:32] pieterh
|
they were designed for web services, indeed
|
[17:32] kleppari
|
mato: can't promise anything, though, but I'll try
|
[17:33] mato
|
kleppari: no hurry; also, where are you located? it's not worth shipping that stuff from outside of the EU, too expensive
|
[17:33] kleppari
|
iceland
|
[17:33] mato
|
lol :)
|
[17:33] kleppari
|
so the shipping might kill the deal :P
|
[17:33] mato
|
kleppari: precisely, unless you have friends at smyril line :-)
|
[17:33] mato
|
"chuck it down with them bananas" :-)
|
[17:34] kleppari
|
heheh
|
[17:35] kleppari
|
I don't think we'd be shipping bananas out of the country, but fair point :P
|
[17:36] pieterh
|
hey... ebay is offering free shipping or something...
|
[17:36] mato
|
kleppari: thanks for the offer in any case
|
[17:37] pieterh
|
0beer0clock
|
[17:37] mato
|
øbeer
|
[17:37] pieterh
|
zbeer
|
[17:37] mato
|
cyl
|
[17:37] pieterh
|
cyrsn
|
[17:38] mato
|
øøøøøøøøøøøø!
|
[17:38] pieterh
|
:-) I knew it...
|
[17:39] pieterh
|
oh... hang on...
|
[17:43] kleppari
|
alt gr + o ?
|
[17:43] kleppari
|
works on the .is layout
|
[17:44] guido_g
|
and on 105 key .de (qwertz)
|
[17:44] guido_g
|
and I just saw that ch. 3 is back...
|
[17:44] pieterh
|
yeah, I fixed it
|
[17:46] pieterh
|
cyal, i'm off for zbeer
|
[17:47] pieterh
|
guido_g: there's also a new problem solver in Ch1
|
[17:47] guido_g
|
oh...
|
[17:50] guido_g
|
nice one
|
[17:50] guido_g
|
but can you do it in uml?
|
[17:51] guido_g
|
*ducks for cover*
|
[17:53] guido_g
|
more serious thing: smaller version that fits on one page so it can be printed easily
|
[18:04] keffo
|
sustrik, What scenario could trigger a deadlock? (in single threaded proc.)
|
[19:09] Samy
|
sustrik, ping?
|
[19:09] Samy
|
sustrik, http://codepad.org/u1DWN3FG is the correct version.
|
[19:27] ModusPwnens
|
Does zeromq have a lot of start up overhead of initialization?
|
[19:27] ModusPwnens
|
or initialization*
|
[19:36] ModusPwnens
|
the only reason I ask is because i keep getting strange results when I benchmark where the throughput is low initially, but then gradually increases
|