[Time] Name | Message |
[02:30] gdan
|
hello. trying to install on OSX. it says to install pkg-config. does anyone know where the proper place to get pkg-config is?
|
[02:31] Steve-o
|
it's not required for the 2.1.0 beta
|
[02:31] Steve-o
|
it's a part of GLib
|
[02:32] gdan
|
i'm installing the latest stable release zeromq-2.0.10
|
[02:33] Steve-o
|
this should help: http://stackoverflow.com/questions/3522248/how-do-i-compile-jzmq-for-zeromq-on-osx
|
[02:33] gdan
|
thNKA
|
[02:33] gdan
|
THANKS
|
[02:38] Steve-o
|
gdan_: but I'd recommend the 2.1.0 as a lot of changes have occurred
|
[02:42] gdan
|
appears to have worked. thanks. will test tomorrow :)
|
[03:19] cremes
|
gdan: with macports, install libtool and pkg-config; you can then install from github master on osx
|
[08:43] Evet
|
mongrel2 is really cool
|
[08:58] pieterh
|
sustrik: ping
|
[09:00] sustrik
|
pong
|
[09:05] pieterh
|
I've set up a link shortening service at zero.mq
|
[09:06] sustrik
|
what's that?
|
[09:06] pieterh
|
e.g. http://zero.mq/tips
|
[09:06] pieterh
|
jumps to a long URL on zeromq.org somewhere
|
[09:06] pieterh
|
useful for twitter and such
|
[09:07] pieterh
|
:-) you've got an invitation to join the zero.mq site
|
[09:07] pieterh
|
accept it
|
[09:08] pieterh
|
then when you want a shortcut, go to that page (e.g. zero.mq/sustrik)
|
[09:08] pieterh
|
edit, save
|
[09:09] sustrik
|
aha
|
[09:09] sustrik
|
thanks
|
[09:10] pieterh
|
so btw if you have any critique on http://zero.mq/osdp, let me have it...
|
[09:10] sustrik
|
i've just read it
|
[09:10] sustrik
|
two points maybe
|
[09:10] sustrik
|
it looks like a design for stock quote ticker plant
|
[09:10] sustrik
|
but it claims that it's more generic than that
|
[09:11] sustrik
|
i am not sure it would fit any other use case
|
[09:11] sustrik
|
second, if you really want to launch the project
|
[09:11] sustrik
|
i would suggest creating a new site for it
|
[09:12] sustrik
|
otherwise there'll be confusion about layering
|
[09:12] sustrik
|
third: i like the name
|
[09:12] sustrik
|
ah
|
[09:12] sustrik
|
one more
|
[09:13] sustrik
|
the paper suggests that the only pattern used in ticker plant is pub-sub
|
[09:13] sustrik
|
in real world you'll need parallelised pipelines as well
|
[09:13] sustrik
|
for data processing after the data is delivered to the right place
|
[09:13] pieterh
|
hang on, visitor...
|
[09:13] pieterh
|
brb, :-)
|
[09:29] mikko
|
good morning
|
[09:42] sustrik
|
morning
|
[10:08] mikko
|
heheh
|
[10:08] mikko
|
i will give this a test
|
[10:11] sustrik
|
i would guess the release build forgets to define ZMQ_HAVE_OPENPGM
|
[10:12] sustrik
|
if the problem was with building openPGM as such the build would fail, which is not the case
|
[10:12] mikko
|
hmmm
|
[10:21] mikko
|
sustrik: took a look at the project template
|
[10:21] mikko
|
interesting as i don't understand how the debug build has openpgm set
|
[10:21] pieterh
|
sustrik: re
|
[10:22] pieterh
|
mikko: let me take a look
|
[10:22] mikko
|
the occurrence libzmq/libzmq.vcproj: PreprocessorDefinitions="ZMQ_HAVE_OPENPGM"
|
[10:22] mikko
|
seems to be under the WithOpenPGM build configuration
|
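A quick way to see which build configurations actually define a macro such as ZMQ_HAVE_OPENPGM is to scan the project file. The sketch below assumes a simplified Visual Studio 2008-style .vcproj layout (Configuration elements holding a Tool element with a PreprocessorDefinitions attribute). The sample XML and the function name configs_defining are illustrative and are not taken from the real libzmq.vcproj.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down project file; the real libzmq.vcproj is much larger.
SAMPLE_VCPROJ = """\
<VisualStudioProject Name="libzmq">
  <Configurations>
    <Configuration Name="Debug|Win32">
      <Tool Name="VCCLCompilerTool" PreprocessorDefinitions="DLL_EXPORT;_DEBUG"/>
    </Configuration>
    <Configuration Name="Release|Win32">
      <Tool Name="VCCLCompilerTool" PreprocessorDefinitions="DLL_EXPORT;NDEBUG"/>
    </Configuration>
    <Configuration Name="WithOpenPGM|Win32">
      <Tool Name="VCCLCompilerTool" PreprocessorDefinitions="DLL_EXPORT;ZMQ_HAVE_OPENPGM"/>
    </Configuration>
  </Configurations>
</VisualStudioProject>
"""


def configs_defining(vcproj_xml, macro):
    """Return the names of configurations whose compiler tool defines `macro`."""
    root = ET.fromstring(vcproj_xml)
    found = []
    for config in root.iter("Configuration"):
        for tool in config.iter("Tool"):
            defs = tool.get("PreprocessorDefinitions", "")
            # Definitions are ';'-separated and may carry a value (NAME=1).
            names = [d.split("=")[0].strip() for d in defs.split(";")]
            if macro in names:
                found.append(config.get("Name"))
                break
    return found


if __name__ == "__main__":
    print(configs_defining(SAMPLE_VCPROJ, "ZMQ_HAVE_OPENPGM"))
```

Run against a real project file, this would show at a glance whether the macro is limited to one configuration, which is the situation described above.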
[10:22] sustrik
|
so there's Debug/Release/WithOpenPGM currently?
|
[10:22] mikko
|
as far as i understood we have never built openpgm with msvc by default
|
[10:22] mikko
|
sustrik: yes
|
[10:23] mikko
|
WithOpenPGM is a release build
|
[10:23] sustrik
|
right
|
[10:23] mikko
|
it uses external openpgm
|
[10:23] mikko
|
maybe we should build openpgm by default
|
[10:24] sustrik
|
maybe the guy messed up his executables
|
[10:24] sustrik
|
using withopenpgm instead of debug
|
[10:24] mikko
|
msvc is such a pain
|
[10:24] sustrik
|
it is :(
|
[10:24] mikko
|
i wish they had "build autotools project" as an option
|
[10:25] sustrik
|
unlikely to ever happen
|
[10:25] pieterh
|
we could always switch to cmake... :-)
|
[10:26] sustrik
|
we could, but still, you'll have to maintain 2 build systems
|
[10:26] sustrik
|
and i am not sure whether win devs would appreciate that kind of thing
|
[10:27] pieterh
|
other option: generate the autotooling, generate the msvc projects from some higher-level abstraction
|
[10:28] sustrik
|
have you seen the autotools source?
|
[10:29] sustrik
|
it's an intricate maze of different options and their combinations
|
[10:29] sustrik
|
hm
|
[10:29] pieterh
|
you don't need to navigate/understand that maze to *generate* it
|
[10:30] pieterh
|
all you need are good templates with sections where we insert the variable stuff
|
[10:30] pieterh
|
imho
|
[10:30] sustrik
|
easy way would be to generate just the source file list
|
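The "templates plus a generated source file list" idea can be sketched in a few lines. This is only an illustration under stated assumptions: the file names, the template text, and the function names render_makefile_am and render_vcproj_files are hypothetical, and real Makefile.am and .vcproj files carry far more than a file list.

```python
from string import Template

# Hypothetical single source of truth for the list of source files.
SOURCES = ["ctx.cpp", "socket_base.cpp", "zmq.cpp"]

# Minimal templates; only the variable part (the file list) is substituted.
MAKEFILE_AM = Template("libzmq_la_SOURCES = $sources\n")
VCPROJ_FILES = Template("<Files>\n$entries\n</Files>\n")


def render_makefile_am(sources):
    """Render an automake-style source list with line continuations."""
    return MAKEFILE_AM.substitute(sources=" \\\n    ".join(sources))


def render_vcproj_files(sources):
    """Render a .vcproj-style <Files> section for the same list."""
    entries = "\n".join(
        '  <File RelativePath="..\\..\\src\\{0}"/>'.format(name)
        for name in sources
    )
    return VCPROJ_FILES.substitute(entries=entries)


if __name__ == "__main__":
    print(render_makefile_am(SOURCES))
    print(render_vcproj_files(SOURCES))
```

Adding a new source file then means editing one list. The hard parts discussed here (compiler options, platform checks) stay in the hand-written portions of each template.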
[10:30] pieterh
|
for instance
|
[10:31] pieterh
|
well, you know how powerful code generation can be when it has to be
|
[10:31] sustrik
|
yes
|
[10:31] sustrik
|
but it's adding one more layer of complexity to the build
|
[10:32] pieterh
|
not to the build, but to the tooling *we* use to maintain the build
|
[10:32] pieterh
|
if you want less pain working with details, you always need abstraction
|
[10:33] sustrik
|
unless the abstraction is more complex than the details themselves
|
[10:33] sustrik
|
which is likely the case here
|
[10:33] sustrik
|
anyway, code generation is not the problem here
|
[10:33] pieterh
|
so continue to complain about MSVC :-)
|
[10:33] sustrik
|
the problem is that msvc is nasty
|
[10:33] pieterh
|
autotools by contrast is... ... nice?
|
[10:33] pieterh
|
all build systems are nasty
|
[10:33] sustrik
|
nasty as well
|
[10:34] sustrik
|
you just have to deal with the nastiness
|
[10:34] sustrik
|
either in code generator or in projects files
|
[10:34] sustrik
|
you can't escape that
|
[10:34] pieterh
|
I know you do not like code generation but to generalize that as "complex and nasty" is just wrong
|
[10:34] pieterh
|
I can dismiss any abstraction like that
|
[10:34] pieterh
|
C++ is complex and nasty
|
[10:35] sustrik
|
sure
|
[10:35] pieterh
|
messaging is complex and nasty
|
[10:35] sustrik
|
the problem is that we have two nasty build tools
|
[10:35] sustrik
|
we have to deal with
|
[10:35] pieterh
|
you don't have to deal with, you choose to deal with
|
[10:35] pieterh
|
especially since you reject the alternatives without even studying them
|
[10:35] sustrik
|
CMake?
|
[10:36] pieterh
|
have we actually even tried anything else?
|
[10:36] sustrik
|
nope
|
[10:36] pieterh
|
even for a simple test case?
|
[10:36] sustrik
|
well, CMake
|
[10:36] pieterh
|
all I hear is "autotools is what everyone uses"
|
[10:36] sustrik
|
long time ago
|
[10:36] pieterh
|
and "everything else is crap anyhow"
|
[10:36] pieterh
|
that's really scientific
|
[10:36] sustrik
|
it was constrained to 2-3 platforms
|
[10:37] pieterh
|
it's only one of several plausible alternatives
|
[10:37] sustrik
|
and nobody actually bothered to make it as universal as autotools build
|
[10:37] pieterh
|
i'm getting repetitive, but many aspects of our process seem to be based on closed minds and opinion, not evidence
|
[10:37] sustrik
|
i think nobody has resource to fully convert the existing autobuild solution into something else
|
[10:38] pieterh
|
of course not
|
[10:38] sustrik
|
so, if you try
|
[10:38] pieterh
|
especially when the likely outcome is "well, thanks, but we'll stick to our nasty tooling"
|
[10:38] sustrik
|
you'll end up with maintaining both autotools and the new system
|
[10:38] pieterh
|
no-one is going to waste their time
|
[10:39] sustrik
|
mikko: what's your opinion?
|
[10:39] pieterh
|
note that I've successfully built massive systems like OpenAMQ on dozens of platforms with trivial maintenance effort
|
[10:39] sustrik
|
i would say it would take a couple of man-years to convert to a different build tool
|
[10:39] pieterh
|
lol
|
[10:39] pieterh
|
that's your standard estimate when you don't like something
|
[10:39] pieterh
|
it's either "two days of work" or "two man years"
|
[10:40] pieterh
|
:-)
|
[10:40] sustrik
|
you are free to try
|
[10:40] pieterh
|
I would, if there was even a small chance my time would not be wasted
|
[10:40] sustrik
|
there is
|
[10:40] sustrik
|
once you have a solution that covers all the cases the existing solution does
|
[10:40] pieterh
|
not until there is actual consensus that we want to improve this
|
[10:41] pieterh
|
that's not how it works
|
[10:41] sustrik
|
unless you do that, we can't drop autotools
|
[10:41] sustrik
|
thus we would have to maintain 3 build systems instead of 2
|
[10:41] pieterh
|
if I come tomorrow with a full solution, there's 80% chance it'll be rejected on the grounds of "we don't know it, it's nasty, and too complex, and not standard"
|
[10:41] pieterh
|
we've been through that several times
|
[10:42] pieterh
|
there has to be consensus, up front, that we are in fact open to change
|
[10:42] sustrik
|
the question would rather be "does it build on HP-UX, version from 1992?"
|
[10:42] pieterh
|
well, does it?
|
[10:42] pieterh
|
openamq does
|
[10:42] sustrik
|
does it cross-compile from osx to mips?
|
[10:42] sustrik
|
and similar
|
[10:42] sustrik
|
that's the value that's already stored in current build system
|
[10:43] pieterh
|
again, there are several options here
|
[10:43] sustrik
|
if we want to change we have to find out how to extract the value
|
[10:43] pieterh
|
one would be to throw out that value and start from scratch
|
[10:43] sustrik
|
:(
|
[10:43] pieterh
|
two would be to layer an *abstraction* on top that reduces the actual pain of maintenance
|
[10:43] pieterh
|
you dismissed that out of hand
|
[10:43] sustrik
|
how would you do that?
|
[10:44] pieterh
|
so I'm going back to open source data plants
|
[10:44] pieterh
|
code generation, of course
|
[10:44] pieterh
|
it is a standard problem
|
[10:44] pieterh
|
you have a mix of complex code you very rarely want to touch
|
[10:44] pieterh
|
and simple code you have to maintain almost daily
|
[10:44] pieterh
|
that is why it's painful
|
[10:45] sustrik
|
ok, there are 2 parts to the build system
|
[10:45] pieterh
|
'add this new source to the project' is a major effort today
|
[10:45] sustrik
|
the often used part is the list of files
|
[10:45] pieterh
|
it means knowing about, and editing, dozens of highly complex sources
|
[10:45] sustrik
|
that's easy to maintain even now
|
[10:45] pieterh
|
msvc project for example
|
[10:45] sustrik
|
then there's the "how to build it" part
|
[10:45] pieterh
|
totally separate
|
[10:46] pieterh
|
as long as you deliver MSVC developers the right projects, it's click and build
|
[10:46] pieterh
|
*obviously* you want to offer Linux users autogen/configure/make
|
[10:46] sustrik
|
yes, sure, so you can generate the lists of files
|
[10:46] pieterh
|
how many people understand the tools used to generate man pages?
|
[10:46] sustrik
|
that would save me a minute every now and then
|
[10:47] pieterh
|
yet we can all type 'man zmq'
|
[10:47] sustrik
|
the hard part that started this discussion though was not the easy "file list" part
|
[10:47] sustrik
|
it was the intricacies of msvc and autotools
|
[10:48] sustrik
|
which you have to deal with in code generator anyway
|
[10:48] pieterh
|
one problem is that you're hand-crafting solutions like "how do I integrate package X into my product"
|
[10:48] mikko
|
sorry, let me catch up with backlog
|
[10:48] sustrik
|
anyway, mikko is the build system maintainer, so it's up to him
|
[10:49] pieterh
|
sustrik: if you'd said that 20 minutes ago, I'd have had my coffee by now :-)
|
[10:49] mikko
|
i think rather than generating something from a definition i would say we need to validate
|
[10:49] pieterh
|
ok, mikko, sustrik, I have a proposal
|
[10:50] mikko
|
i think creating a file which checks preprocessor definitions on that platform with that compiler and errors out if they are not there
|
[10:50] pieterh
|
would you agree that the zmq builds are a more complex form of the zfl builds?
|
[10:50] mikko
|
pieterh: yes
|
[10:50] mikko
|
in a way
|
[10:50] pieterh
|
so since mikko and I are maintaining ZFL, we'll start by improving that
|
[10:51] pieterh
|
what I mean specifically
|
[10:51] pieterh
|
is that I will try to convince Mikko that code generation will make this magically simple
|
[10:51] pieterh
|
if this works for ZFL, and Mikko is convinced, we have a way to apply that to ZMQ
|
[10:51] mikko
|
cmake is trying to be this
|
[10:51] mikko
|
as far as i understand
|
[10:52] pieterh
|
yes, but cmake is not pieter
|
[10:52] pieterh
|
there is a real difference
|
[10:52] pieterh
|
I've been making portable build systems since 1985 or so
|
[10:52] sustrik
|
don't fall prey to NIH syndrome though
|
[10:52] pieterh
|
my key techniques are ignorance and laziness
|
[10:53] mikko
|
the problem with build systems is that the formats and approaches are completely different
|
[10:53] mikko
|
you need to hack the end format converters for every build system
|
[10:53] pieterh
|
mikko: let's not philosophize over details
|
[10:54] mikko
|
i see this in a way that rather than generate build systems we need to be able to validate them
|
[10:54] mikko
|
this should be simple with a header file that errors if preprocessor definition is missing or incorrect ones are set
|
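The validation header described here could look like the text this sketch prints. Python is used only to build the header text; the header itself is plain C preprocessor directives. The function name validation_header and the choice of which macros are required or forbidden are assumptions made for illustration.

```python
def validation_header(required, forbidden):
    """Build C preprocessor text that fails the compile on a bad define set."""
    lines = ["/* Build validation check: illustrative only. */"]
    for macro in required:
        lines += [
            "#if !defined {0}".format(macro),
            '#error "{0} must be defined for this build"'.format(macro),
            "#endif",
        ]
    for macro in forbidden:
        lines += [
            "#if defined {0}".format(macro),
            '#error "{0} must not be defined for this build"'.format(macro),
            "#endif",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example: a Windows build must set the Windows macro and not the Linux one.
    print(validation_header(["ZMQ_HAVE_WINDOWS"], ["ZMQ_HAVE_LINUX"]))
```

Including such a header in one translation unit would make a misconfigured project fail at compile time instead of silently building without a feature.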
[10:56] pieterh
|
mikko: will you allow me to try some stuff and show you?
|
[10:56] pieterh
|
it'll take me a long weekend to put together a working toolchain for ZFL
|
[10:56] mikko
|
pieterh: of course
|
[10:56] mikko
|
im not going to stop you
|
[10:56] pieterh
|
we can add backends to e.g. generate cmake files and other stuff after
|
[10:56] pieterh
|
ok...
|
[10:57] mikko
|
but when you say that, this is the caveat
|
[10:57] pieterh
|
there's no catch, if you don't like it, we chuck it out
|
[10:57] mikko
|
ok
|
[10:57] mikko
|
works for me
|
[10:57] pieterh
|
well, there is a catch
|
[10:57] mikko
|
haha
|
[10:57] mikko
|
i knew it
|
[10:57] pieterh
|
kind of like selling your soul to the devil
|
[10:57] pieterh
|
you get a tooling dependency but it's no worse than what we have today
|
[10:58] mikko
|
tooling dependency to our own tool?
|
[10:58] pieterh
|
hey, I got a pull request for zguide!
|
[10:58] pieterh
|
yay!
|
[10:58] mikko
|
if it's our tool there is really no hard dependency
|
[10:58] mikko
|
assuming it's abstract we can add more "backends"
|
[10:58] pieterh
|
so mikko, what's the weather like there?
|
[10:59] mikko
|
im still not convinced that you can represent all these build systems in abstract way but i am open to be amazed
|
[10:59] mikko
|
gray
|
[10:59] pieterh
|
hehe
|
[10:59] mikko
|
london isn't the weather capital of europe
|
[11:00] pieterh
|
usually a bit nicer than Brussels IME but yeah...
|
[11:01] pieterh
|
So my idea for the Guide is to produce translated versions automatically
|
[11:01] pieterh
|
I.e. a C++ version, a Java version, etc.
|
[11:02] pieterh
|
with all examples in that language
|
[11:02] mikko
|
docbook is a fairly portable format for that
|
[11:02] mikko
|
php documentation is generated from docbook
|
[11:02] pieterh
|
oh, mikko, I have the tooling already
|
[11:02] mikko
|
hahaha
|
[11:02] mikko
|
let me guess, you wrote them?
|
[11:02] pieterh
|
this is just pulling in examples/PHP instead of examples/C
|
[11:02] pieterh
|
have you seen gitdown?
|
[11:03] mikko
|
the markdown format?
|
[11:04] pieterh
|
https://github.com/imatix/gitdown
|
[11:05] mikko
|
ok
|
[11:05] mikko
|
remember to update copyright year
|
[11:05] pieterh
|
that's the basic tooling for the Guide, though that uses an older hacked version
|
[11:05] mikko
|
Copyright (c) 2010 Pieter Hintjens Copyright (c) 1996-2010 iMatix Corporation
|
[11:06] pieterh
|
fixed
|
[11:06] pieterh
|
so the idea is to take a simple text file with images as ditaa text
|
[11:07] pieterh
|
and mangle into some markup - for the Guide, it's wikidot markup
|
[11:07] pieterh
|
part of that is merging in code from the examples directory
|
[11:07] pieterh
|
so it's easy to make, e.g., a PHP version of the Guide in parallel with the current C version
|
[11:07] pieterh
|
my idea is to do that only for languages that hit 75% translation
|
[11:11] pieterh
|
sustrik: thanks for the feedback on the OSDP
|
[11:12] pieterh
|
indeed, it'll be a separate site and project
|
[11:17] pieterh
|
mikko: question
|
[11:18] pieterh
|
why do we build the cJSON object, in zfl/src?
|
[11:22] pieterh
|
rather, what was the benefit of changing the way cJSON was imported?
|
[11:24] sustrik
|
pieterh: you are being cited here:
|
[11:24] sustrik
|
http://it.toolbox.com/blogs/open-source-smb/whats-the-future-of-amqp-44450
|
[11:38] mikko
|
pieterh: to automatically include in make dist
|
[11:38] mikko
|
it can be added as a no-build source as well
|
[11:39] pieterh
|
sustrik: I answered that, but it's a really annoying blog to post to
|
[11:39] pieterh
|
mikko: I'd rather it's a no-build source...
|
[11:39] pieterh
|
the goal was to hide that totally in zfl_config_json, not export it as an object
|
[11:39] pieterh
|
otherwise apps will use it
|
[11:40] mikko
|
pieterh: i'll make that change
|
[11:40] pieterh
|
i appreciate it!
|
[11:40] mikko
|
need to quickly finish this excel
|
[11:40] pieterh
|
no hurry, no hurry
|
[11:49] pieterh
|
sustrik: I've made a comment or two on that post
|
[12:50] Evet
|
are there obvious performance differences of zeromq between windows and linux?
|
[12:52] mikko
|
Evet: not obvious
|
[12:52] mikko
|
that i am aware of
|
[12:53] Steve-o
|
well IPC and PGM are slower
|
[12:53] mikko
|
optimizing things for windows seems to be very much a dark art
|
[12:53] mikko
|
ipc doesn't exist
|
[12:53] mikko
|
well, not in zeromq
|
[12:53] mikko
|
Steve-o: thanks for applying that
|
[12:53] mikko
|
Steve-o: i'll test freebsd next
|
[12:55] Steve-o
|
no worries, just unit testing on Solaris
|
[12:57] Steve-o
|
Linux passes now, FreeBSD built, Solaris built, Mingw to go for unit tests on Windows
|
[13:03] Evet
|
Steve-o: do you test it on virtualbox, etc?
|
[13:04] Steve-o
|
real hardware, vbox hasn't got multicast support I think
|
[13:10] Evet
|
Steve-o: do you compare zeromq performance, stability among linux, *bsd, solaris?
|
[13:11] Steve-o
|
and Windows
|
[13:11] Steve-o
|
but I don't have high end SPARC hardware for a fair comparison
|
[13:12] Steve-o
|
unsurprisingly it's linux > freebsd > solaris > windows
|
[13:13] Steve-o
|
windows only works well when you are not using it, i.e. offloading into hardware
|
[13:17] sustrik
|
lol
|
[13:17] sustrik
|
"windows only works well when you are not using it"
|
[13:17] Steve-o
|
which is why TCP is pretty good on it :)
|
[13:29] pieterh
|
Evet: in my experience Windows is about 3-5 times slower on the same hardware, for TCP
|
[13:29] Evet
|
its sad that i have two windows servers
|
[13:30] pieterh
|
one is accidental, two is bad luck, three would be unforgivable :-)
|
[13:31] pieterh
|
linux is the focus of intense optimization you don't see elsewhere
|
[13:31] Evet
|
low-level stuff is pain in the ass on windows
|
[13:32] pieterh
|
You get lots of choice though
|
[13:32] Evet
|
i had to reboot the server 2 times just for a simple dns server, one for installation and one for updates
|
[13:32] pieterh
|
like, pthreads offers just one way to start a thread... pah!
|
[13:32] guido_g
|
and everything you don't habe a button for is considered low-level
|
[13:32] guido_g
|
*have
|
[13:32] pieterh
|
win32 offers at least three
|
[13:32] pieterh
|
of which two are marked "Don't use this! Will Crash!"
|
[13:33] pieterh
|
it's like no API call was ever improved... it was written (poorly), delivered, and then frozen
|
[13:35] Steve-o
|
WSAPoll is quite amusing
|
[13:35] Steve-o
|
they admit it's slower than select()
|
[13:39] Steve-o
|
then they silently advertise winsock is stupid and slow by implementing wsk sockets
|
[13:40] pieterh
|
Evet: really, if you can use Linux for your production boxes, you're winning big time
|
[13:43] Steve-o
|
well, unless you're employing .net developers I guess
|
[13:43] Evet
|
im planning to chuck out windows and plesk license of 24gigs box and run mongrel2+zeromq only
|
[13:43] pieterh
|
Steve-o: even then, I'd use that only for client boxes that absolutely had to use Windows, e.g. to run Excel
|
[13:44] pieterh
|
you don't want production platforms where the tool chain becomes obsolete every 18 months
|
[13:47] Steve-o
|
I guess at places where they're changing developers every 6-12 months it's not so important :D
|
[13:52] Steve-o
|
anyway but a lot of the blame on Windows is actually with the drivers
|
[14:05] Evet
|
is 2.1.1 (rc1) production ready?
|
[14:07] pieterh
|
Evet: it's a release candidate
|
[14:07] pieterh
|
that means we're expecting to fix some issues before it's production ready
|
[14:44] mikko
|
hey
|
[14:44] mikko
|
nm
|
[14:49] ianbarber
|
good chat
|
[15:00] cremes
|
pieterh: i felt a challenge yesterday to convert my ruby "leak" example to C; here's the result: https://gist.github.com/841318
|
[15:01] cremes
|
it shows something leaking like mad... i don't know if it's 0mq, the OS or what (tested on osx & linux)
|
[15:01] cremes
|
but now you don't have to do the conversion!
|
[15:02] cremes
|
i find that the examples in the guide are useful foundations for converting code from other langs into C
|
[15:02] cremes
|
so thanks for that
|
[15:20] Evet
|
mongrel2 is what i was trying to do with nginx. its really cool
|
[15:59] CIA-21
|
zeromq2: Martin Sustrik master * rc22e527 / doc/zmq_getsockopt.txt :
|
[15:59] CIA-21
|
zeromq2: Minor patch to zmq_getsockopt(3) man page
|
[15:59] CIA-21
|
zeromq2: Signed-off-by: Martin Sustrik <sustrik@250bpm.com> - http://bit.ly/gXYb3A
|
[16:11] gdan
|
on linux i built and installed libzmq, tested (mono app) and all was good. on OSX i built, installed (all seemed to go ok) but my app (mono) does not see the libzmq lib. has anyone had experience with zmq mono on OSX?
|
[16:14] gdan
|
i'm wondering if there is an env path missing from the mono project, for example LD_LIBRARY_PATH in linux
|
[16:14] amacleod
|
OSX is unix too. LD_LIBRARY_PATH should work there the same as it does on Linux.
|
[16:15] amacleod
|
s/unix/unix or unix-like/
|
[16:20] gdan
|
i read that DYLD_LIBRARY_PATH replaces.... don't know. going to have to do more reading as still not seeing the lib. thanks
|
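For reference, the dynamic loader reads a different environment variable on each platform: LD_LIBRARY_PATH on Linux, DYLD_LIBRARY_PATH on OSX, and PATH for DLLs on Windows. The sketch below encodes that mapping and asks the Python standard library whether it can locate libzmq. The function name loader_path_variable is made up for illustration, and how Mono itself resolves the library may involve further configuration that this does not cover.

```python
import ctypes.util
import sys


def loader_path_variable(platform=sys.platform):
    """Name of the environment variable the dynamic loader searches."""
    if platform == "darwin":
        return "DYLD_LIBRARY_PATH"
    if platform.startswith("win"):
        return "PATH"
    return "LD_LIBRARY_PATH"


if __name__ == "__main__":
    print("loader variable:", loader_path_variable())
    # find_library returns a name or path, or None if nothing was found.
    # None does not prove the library is absent; it may live in a prefix
    # (e.g. /usr/local/lib) that is not on the default search path.
    print("libzmq:", ctypes.util.find_library("zmq"))
```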
[16:22] Seta00
|
anyone here familiar with clrzmq2?
|
[16:24] pieterh
|
amacleod: Microsoft, I suspect :-)
|
[16:24] pieterh
|
cremes: nice, I was going to do it but it's 5:30pm and I still didn't get around to it
|
[16:26] cremes
|
pieterh: i haven't really written any C in 15 years so it was kind of nice to dust off those old skills
|
[16:26] pieterh
|
it looks purdy
|
[16:26] cremes
|
but the real question is this... have you run the code to confirm the problem?
|
[16:26] cremes
|
heh
|
[16:26] pieterh
|
i'm trying it now...
|
[16:28] pieterh
|
so, as you wrote it, memory slowly increases
|
[16:28] pieterh
|
now let me remove some possible culprits...
|
[16:28] cremes
|
feel free... i'd love to be proven wrong on this one
|
[16:29] cremes
|
this is the 'bug' i've been chasing for 2.5 weeks... it was masked by that identity issue (which was a real bug) and slowed me down from honing
|
[16:29] cremes
|
in on the real culprit
|
[16:31] pieterh
|
ok, so with a few lines changed, it works nicely
|
[16:31] pieterh
|
some questions
|
[16:31] pieterh
|
is there a good reason you open and close sockets all the time?
|
[16:31] cremes
|
yes, i wrote up my use-case in the issue/ticket
|
[16:31] cremes
|
it's easier to read it there than for me to recount it here
|
[16:32] cremes
|
https://github.com/zeromq/zeromq2/issues#issue/171
|
[16:32] pieterh
|
yes, I've read it
|
[16:32] pieterh
|
is there a *good* reason you open and close sockets all the time
|
[16:33] pieterh
|
rather than, for example, starting your subscribers as inproc clients of a framework that just keeps one socket open all the time
|
[16:33] pieterh
|
I'm sure there are other ways
|
[16:33] cremes
|
what isn't *good* about that use-case? i haven't been able to figure out an alternate methodology
|
[16:34] pieterh
|
as you run your test, in C, go to another window
|
[16:34] cremes
|
yeah, it will have hundreds or thousands of sockets in TIME_WAIT
|
[16:35] mikko
|
hacked together a mash up
|
[16:35] mikko
|
http://valokuva.org/~mikko/where.php?user=zeromq&repo=zeromq2
|
[16:35] mikko
|
shows pull requests by location
|
[16:35] pieterh
|
and then type 'while true; do netstat -a | wc -l; sleep 1; done'
|
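The netstat loop counts every line of output. To watch TIME_WAIT specifically, the output can be grouped by TCP state. A minimal sketch, assuming Linux-style `netstat -an` columns with the state in the last field; the sample text and the function name tcp_states are illustrative.

```python
from collections import Counter

# Hypothetical sample of `netstat -an` output (Linux-style columns).
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 127.0.0.1:5555          0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:5555          127.0.0.1:40112         ESTABLISHED
tcp        0      0 127.0.0.1:40110         127.0.0.1:5555          TIME_WAIT
tcp        0      0 127.0.0.1:40111         127.0.0.1:5555          TIME_WAIT
udp        0      0 0.0.0.0:68              0.0.0.0:*
"""


def tcp_states(netstat_text):
    """Count TCP sockets per connection state in netstat output."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # TCP rows have six columns; the last one is the state.
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            counts[fields[-1]] += 1
    return counts


if __name__ == "__main__":
    print(tcp_states(SAMPLE)["TIME_WAIT"])
```

On a live system the input could come from `subprocess.check_output(["netstat", "-an"], text=True)`, sampled once per second as in the shell loop.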
[16:35] pieterh
|
sockets do not disappear right away
|
[16:35] pieterh
|
you are simply smashing the TCP stack
|
[16:35] pieterh
|
it's not 0MQ's issue
|
[16:35] cremes
|
pieterh: not true
|
[16:35] pieterh
|
well, netstat doesn't even respond here, anymore...
|
[16:36] cremes
|
this is easily illustrated by injecting a forwarder between the publisher and subscribers
|
[16:36] pieterh
|
ok, another question
|
[16:36] cremes
|
you can then kill the publisher and the forwarder's memory footprint never shrinks
|
[16:36] pieterh
|
cremes: that's because the sockets are owned by someone
|
[16:36] pieterh
|
anyhow, that's not your test case...
|
[16:36] cremes
|
but when you kill the publisher, they all go away within a few seconds
|
[16:37] pieterh
|
they = ?
|
[16:37] cremes
|
my prod code only has about 250 sockets in TIME_WAIT at any given time...
|
[16:37] pieterh
|
ok
|
[16:37] cremes
|
the example code was written to show the problem faster
|
[16:37] cremes
|
they = sockets in TIME_WAIT
|
[16:37] pieterh
|
ok, let me post an example that works perfectly
|
[16:37] pieterh
|
your code with a trivial change
|
[16:38] cremes
|
ok
|
[16:38] cremes
|
as an aside, this problem also occurs with ipc transport so i don't think it's a tcp issue
|
[16:38] pieterh
|
https://gist.github.com/842407
|
[16:38] pieterh
|
ipc = local domain sockets
|
[16:38] cremes
|
yeah, different stack
|
[16:39] pieterh
|
I do get:
|
[16:39] pieterh
|
Too many open files
|
[16:39] pieterh
|
rc == 0 (mailbox.cpp:374)
|
[16:39] pieterh
|
Aborted (core dumped)
|
[16:39] pieterh
|
your use case is just being really unpleasant with system resources
|
[16:41] cremes
|
your example with the context changes still leaks like a sieve on my box
|
[16:42] pieterh
|
on mine it doesn't even show on top, I suspect because CPU usage is too low
|
[16:43] cremes
|
change the code to use fewer threads (2 or 3) and change the client to reconnect once per second
|
[16:43] cremes
|
it will still leak
|
[16:43] pieterh
|
ok, trying that
|
[16:44] cremes
|
it will just take longer for it to be obvious it's growing
|
[16:45] pieterh
|
yes, it's definitely growing
|
[16:46] pieterh
|
I'll make the messages huge, see if it's related to that or not
|
[16:46] cremes
|
nope, it's not but please do the experiment :)
|
[16:46] pieterh
|
it's not, you tried, fine
|
[16:47] pieterh
|
next, try with 2.0.10...
|
[16:47] cremes
|
i'll have to dload a copy... give me a few minutes to get that setup
|
[16:47] andrewvc
|
cremes: I'm thinking of releasing a patch version of ffi-rzmq. Sound good?
|
[16:47] pieterh
|
i'll do it, have that already set-up...
|
[16:47] cremes
|
pieterh: ok
|
[16:47] andrewvc
|
the noblock thing is prolly worth it I'm thinking?
|
[16:48] cremes
|
andrewvc: i've made a few more nips & tucks and added some specs
|
[16:48] cremes
|
why don't you commit what you have and i'll merge my stuff in?
|
[16:48] andrewvc
|
hmm, I don't have anything new to add in
|
[16:49] cremes
|
oh, so it's just the fix for the #noblock? method?
|
[16:49] cremes
|
if so, i'll cut a release later today
|
[16:49] andrewvc
|
cool
|
[16:49] andrewvc
|
btw, did you see https://github.com/chuckremes/ffi-rzmq/issues#issue/20
|
[16:49] cremes
|
yep, saw it
|
[16:50] cremes
|
do you see EBADF in that circumstance?
|
[16:50] andrewvc
|
nah, I'm not sure if sustrik thought that was related to my other issue
|
[16:51] andrewvc
|
which tmm1 fixed last night in eventmachine
|
[16:51] andrewvc
|
it does bring up one good point perhaps. I don't think we invalidate sockets on term
|
[16:51] andrewvc
|
which is good
|
[16:52] andrewvc
|
maybe we should cut out this line though
|
[16:52] andrewvc
|
https://github.com/chuckremes/ffi-rzmq/blob/master/lib/ffi-rzmq/context.rb#L63
|
[16:53] andrewvc
|
not that it probably matters at all, but it might make sense given sustrik's comment
|
[16:53] cremes
|
hmmm... i would need to see some code that breaks because of that line before removing it
|
[16:53] andrewvc
|
yeah, I don't think it matters at all, I'm not certain we have any issue in the first place
|
[16:53] sustrik
|
hi, what comment?
|
[16:54] sustrik
|
the one in the email?
|
[16:54] andrewvc
|
\or the user to close the socket so you should not see EBADF in such case if you do, it's a bug
|
[16:54] andrewvc
|
ack
|
[16:54] andrewvc
|
I posted it here: https://github.com/chuckremes/ffi-rzmq/issues#issue/20
|
[16:54] andrewvc
|
did you see a specific bug sustrik?
|
[16:55] andrewvc
|
As far as I know calling term in ffi-rzmq doesn't touch socket objects
|
[16:55] sustrik
|
i'm lost
|
[16:55] sustrik
|
what are you speaking about
|
[16:55] sustrik
|
?
|
[16:55] andrewvc
|
in IRC the other day you mentioned
|
[16:55] andrewvc
|
in 2.1, zmq_term() should not invalidate the sockets; what it does is cause the socket to return ETERM on any subsequent call and wait for the user to close the socket, so you should not see EBADF in such a case; if you do, it's a bug
|
[16:55] andrewvc
|
lol
|
[16:56] andrewvc
|
I thought you'd directed that at me sustrik
|
[16:56] andrewvc
|
this was at like 11:30PM I was really tired and confused
|
[16:56] andrewvc
|
lol
|
[16:56] sustrik
|
nope, it's just describing how it works
|
[16:56] sustrik
|
EBADF just should not happen
|
[16:57] andrewvc
|
ohhh, yeah, I guess at that point I hadn't apprised you of the situation that it was caused by an external process closing the FD, not a call to ZMQ::Term
|
[16:57] sustrik
|
i guess so
|
[16:58] pieterh
|
cremes: well, on 2.0.10 it has the same problems
|
[16:58] pieterh
|
it behaves far worse with TCP than inproc
|
[16:59] cremes
|
pieterh: yuck
|
[16:59] cremes
|
andrewvc: i'm going to strip that line out; if you look closer you'll see nothing is ever added to that array... dead code!
|
[16:59] andrewvc
|
oh nice
|
[17:00] andrewvc
|
ok, well, off to work, adios
|
[17:00] cremes
|
pieterh: more information...
|
[17:00] cremes
|
i tried running this code under java/jvm so that i could strictly control the heap size
|
[17:01] cremes
|
it's interesting in that the rsize grows well beyond the max heap size and the jvm doesn't start throwing outofmemory errors
|
[17:01] pieterh
|
cremes: it also consumes a heck of a lot of CPU for a process that does essentially nothing...
|
[17:01] cremes
|
this indicates to me that this memory gain isn't visible to the application but the OS sees it
|
[17:01] pieterh
|
indeed
|
[17:02] pieterh
|
that is what it looks like, system resources (but owned by a process)
|
[17:02] cremes
|
right
|
[17:02] pieterh
|
what's weird is that it happens with inproc
|
[17:02] cremes
|
but then i would expect the OS to run out of that resource and start failing but it never does
|
[17:03] cremes
|
at least it doesn't before my 16GB box is completely exhausted and used up all of its swap!
|
[17:03] cremes
|
only then does the OS kill it
|
[17:03] pieterh
|
I suspect there is more than one symptom
|
[17:03] cremes
|
it doesn't make sense that inproc would show it
|
[17:03] Silly
|
this on linux? The default linux policy for the OS to give memory is to always succeed, then fail when it can't allocate the actual page
|
[17:03] Silly
|
with a segfault
|
[17:04] cremes
|
Silly: osx and linux, fails same way
|
[17:04] pieterh
|
cremes: that was the change I made, tcp->inproc
|
[17:04] sustrik
|
if it happens with inproc as well and the memory is locked in the OS
|
[17:04] Silly
|
cremes: huh
|
[17:04] sustrik
|
it's definitely the mailboxes associated with individual sockets
|
[17:04] pieterh
|
sustrik: did we have mailboxes in 2.0.10?
|
[17:04] sustrik
|
yeah
|
[17:04] pieterh
|
2.0.9?
|
[17:04] sustrik
|
yes
|
[17:05] mikko
|
pieterh: i got feature request for zdns
|
[17:05] pieterh
|
mikko: zns, it's zns... zdns are a bunch of splitters
|
[17:05] sustrik
|
in any case, the mailboxes for closed sockets should be closed by the reaper thread in 2.1.1
|
[17:05] pieterh
|
ah, how quickly does the reaper work?
|
[17:05] sustrik
|
if they are not, there's a bug somewhere
|
[17:06] sustrik
|
as fast as physically possible
|
[17:06] sustrik
|
no sleeps or similar
|
[17:06] pieterh
|
can we add an assertion that the number of mailboxes doesn't exceed some ceiling?
|
[17:06] pieterh
|
just for testing, not for real
|
[17:06] sustrik
|
let me see
|
[17:06] pieterh
|
mikko: what's the suggestion?
|
[17:07] mikko
|
pieterh: ephemeral entries
|
[17:07] pieterh
|
sounds good
|
[17:07] mikko
|
not sure if that is in, just saw the repo name in my feed
|
[17:07] pieterh
|
what does that mean?
|
[17:07] mikko
|
the entry exists only if the node who created it exists
|
[17:07] pieterh
|
ah, I put the zfl maintainers in as the default zns team
|
[17:07] sustrik
|
pieterh: see reaper.cpp:105
|
[17:07] cremes
|
sustrik: the code might be leaking mailboxes?
|
[17:08] sustrik
|
that's the only system resource held by socket when inproc is used
|
[17:09] pieterh
|
sustrik: ack, trying that
|
[17:09] pieterh
|
mikko: sounds good, worth noting, not sure how to make it
|
[17:09] mikko
|
pieterh: heartbeat maybe
|
[17:09] pieterh
|
sure
|
[17:09] mikko
|
pieterh: this is a concept from zookeeper
|
[17:10] pieterh
|
i wanted to build zns using the clone pattern from ch3
|
[17:10] pieterh
|
distributed key-value table
|
[17:10] mikko
|
pieterh: amazingly handy, as if a box fails it automatically disappears from the service discovery
|
[17:10] pieterh
|
of course
|
[17:10] pieterh
|
it's an excellent way to do failover
|
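Editor's sketch of the "ephemeral entries" idea discussed above: an entry stays visible only while its owner keeps sending heartbeats. This is not zns code. The class name `EphemeralRegistry`, the `ttl` parameter, and the injectable `clock` are all assumptions made for illustration.

```python
import time


class EphemeralRegistry:
    """Hypothetical name table whose entries vanish when heartbeats stop."""

    def __init__(self, ttl=3.0, clock=time.monotonic):
        self._ttl = ttl          # seconds an entry survives without a heartbeat
        self._clock = clock      # injectable so the behaviour can be tested
        self._entries = {}       # name -> (value, time of last heartbeat)

    def _expire(self):
        # Drop every entry whose owner has been silent for longer than ttl.
        now = self._clock()
        dead = [name for name, (_, seen) in self._entries.items()
                if now - seen > self._ttl]
        for name in dead:
            del self._entries[name]

    def register(self, name, value):
        self._entries[name] = (value, self._clock())

    def heartbeat(self, name):
        # Returns False if the entry has already expired or never existed.
        self._expire()
        if name not in self._entries:
            return False
        value, _ = self._entries[name]
        self._entries[name] = (value, self._clock())
        return True

    def lookup(self, name):
        self._expire()
        entry = self._entries.get(name)
        return entry[0] if entry else None


if __name__ == "__main__":
    registry = EphemeralRegistry(ttl=0.2)
    registry.register("pricefeed", "tcp://192.168.0.10:5555")
    print(registry.lookup("pricefeed"))   # still alive
    time.sleep(0.3)                       # owner stops sending heartbeats
    print(registry.lookup("pricefeed"))   # entry has expired
```

A node that crashes simply stops calling `heartbeat`, so its entry drops out of lookups after `ttl` seconds, which is the failover behaviour mikko describes.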
[17:14] pieterh
|
mikko: any other ideas for the name service?
|
[17:15] pieterh
|
i'll collect them on zero.mq/zns
|
[17:18] pieterh
|
sustrik: cremes: using inproc, it climbs to some point and then stops leaking memory
|
[17:18] cremes
|
that's interesting
|
[17:18] pieterh
|
I've added an assertion that the number of sockets (reaper.cpp) is under 100
|
[17:19] pieterh
|
that assertion never hits
|
[17:19] cremes
|
any hypothesis?
|
[17:19] mikko
|
pieterh: no, that's the single biggest thing
|
[17:19] mikko
|
i will have more soon probably
|
[17:19] pieterh
|
back to TCP/IPC leakage due to zombie sockets that go away either very slowly, or not at all
|
[17:19] pieterh
|
mikko: have started here: http://zns.zeromq.org/page:read-the-manual
|
[17:20] pieterh
|
cremes: will run the tcp test again now
|
[17:20] pieterh
|
inproc does not provoke the error, therefore is useless to play with
|
[17:21] pieterh
|
oh...
|
[17:21] pieterh
|
I *was* using tcp, had changed it...
|
[17:21] cremes
|
i added an assert to the if-block inside #process_reaped; that code inside there is never called
|
[17:21] cremes
|
so poller->rm_fd (mailbox_handle) never gets called
|
[17:21] pieterh
|
hmm, interesting
|
[17:23] pieterh
|
cremes, indeed, rm_fd is not being called
|
[17:23] sustrik
|
that looks like it
|
[17:23] cremes
|
yep, somehow the +terminating+ variable is set back to false!
|
[17:24] cremes
|
i don't see any code that would do that
|
[17:24] sustrik
|
where's the test code?
|
[17:24] pieterh
|
sustrik: see irc log just above
|
[17:25] pieterh
|
https://gist.github.com/841318
|
[17:25] sustrik
|
btw, the reason why the number of reaped sockets is never over 100
|
[17:25] sustrik
|
is that the test is creating a new context periodically
|
[17:26] sustrik
|
ah, it's not
|
[17:26] sustrik
|
sorry
|
[17:26] pieterh
|
ah, I just got that assertion, right away
|
[17:26] pieterh
|
over inproc
|
[17:26] pieterh
|
does not hit over tcp
|
[17:27] pieterh
|
it doesn't call rm_fd using inproc either
|
[17:27] pieterh
|
if I don't put the assertion in, over inproc, my process runs out of file handles
|
[17:27] pieterh
|
it's the same error, different symptom
|
[17:27] pieterh
|
do you want that example as well?
|
[17:28] sustrik
|
ok, the code never gets inside "if (!sockets && terminating)"
|
[17:28] pieterh
|
cremes: congrats! you found a bug in rc1, I think!
|
[17:28] sustrik
|
because zmq_term() is never called
|
[17:28] sustrik
|
so it's ok
|
[17:28] pieterh
|
sustrik: there are two places it calls rm_fd, neither ever happens
|
[17:28] sustrik
|
it should not
|
[17:28] sustrik
|
unless you call zmq_term()
|
[17:29] pieterh
|
so when does it reap a socket?
|
[17:29] cremes
|
socket leak! :)
|
[17:29] sustrik
|
the socket reaps itself
|
[17:29] pieterh
|
where?
|
[17:30] pieterh
|
inproc doesn't use sockets at all, right?
|
[17:30] sustrik
|
TCP sockets?
|
[17:30] sustrik
|
no
|
[17:31] sustrik
|
it uses socketpairs though
|
[17:31] sustrik
|
1 socketpair per 0mq socket
|
[17:31] pieterh
|
so my test case uses inproc and dies after a few seconds with too many open files
|
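Editor's sketch of why unreaped sockets end in "too many open files": if each 0MQ socket holds one socketpair, as sustrik says, then each socket that is never reaped pins two file descriptors. The code below uses plain OS socketpairs from the Python standard library as a stand-in; it does not use 0MQ, and the helper names are hypothetical.

```python
import socket


def leak_pairs(n):
    """Open n socketpairs and keep them, like n sockets that are never reaped."""
    return [socket.socketpair() for _ in range(n)]


def count_descriptors(pairs):
    """Number of distinct open file descriptors held by the given pairs."""
    return len({s.fileno() for pair in pairs for s in pair if s.fileno() != -1})


def close_pairs(pairs):
    """Release both ends of every pair, which is what reaping should achieve."""
    for a, b in pairs:
        a.close()
        b.close()


if __name__ == "__main__":
    held = leak_pairs(50)
    print("descriptors held:", count_descriptors(held))
    close_pairs(held)
    print("descriptors held after close:", count_descriptors(held))
```

With a typical per-process descriptor limit in the low thousands, a loop that opens and abandons a socket every few milliseconds would exhaust the limit within seconds, which matches the symptom pieterh reports.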
[17:31] pieterh
|
yet in the code, sockets are properly closed
|
[17:31] sustrik
|
close is async
|
[17:32] pieterh
|
this is on a sub socket, nothing being written
|
[17:32] pieterh
|
how async is async?
|
[17:32] sustrik
|
utterly async
|
[17:32] pieterh
|
seconds? days?
|
[17:32] sustrik
|
microseconds
|
[17:32] pieterh
|
well, our loop here is in msec and sec
|
[17:32] pieterh
|
cremes: let me try init and term in the client loop...
|
[17:33] cremes
|
i was just about to try that
|
[17:33] pieterh
|
btw, HWM and LINGER are irrelevant to this afaics
|
[17:33] cremes
|
i thought so but i wanted to cover all of the bases
|
[17:34] pieterh
|
cremes: your publisher binds to an IP address btw
|
[17:34] pieterh
|
you should bind to *
|
[17:35] pieterh
|
ok, so doing init/term fixes the problem
|
[17:35] pieterh
|
and btw it calls the reaper all the time
|
[17:36] pieterh
|
sustrik: looks like socketpairs leak (at least)
|
[17:36] cremes
|
pieterh: binding to * doesn't always work on osx; i'll put together a test case & a ticket
|
[17:36] cremes
|
in C :)
|
[17:37] sustrik
|
so maybe the reaper doesn't work correctly
|
[17:37] pieterh
|
hmm, we need * to work, make a ticket... yes
|
[17:37] sustrik
|
maybe some sockets are stuck in the middle of the deallocation process
|
[17:37] cremes
|
sustrik: i think it *only* reaps when zmq_term() is called
|
[17:37] sustrik
|
let me try
|
[17:38] pieterh
|
cremes: that's what sustrik explained, it's the 'terminating' flag
|
[17:46] sustrik
|
bingo! reproduced!
|
[17:46] cremes
|
ah, i see why the JVM didn't croak when the heap expanded beyond its limit
|
[17:47] cremes
|
this C lib is on the other side of the JNI "bridge" so memory allocations performed there don't get charged to the JVM
|
[17:47] cremes
|
sustrik: yea!
|
[17:50] pieterh
|
it's going to be a great rc2
|
[17:51] cremes
|
pieterh: it was already a pretty good rc1; rc2 will just be *better*!
|
[17:52] pieterh
|
we need a new superlative generator
|
[17:52] cremes
|
that's a stupendous idea
|
[17:56] Evet
|
is there a function to store messages to disk?
|
[17:56] pieterh
|
Evet: not in 0MQ, nope
|
[17:56] pieterh
|
but in every language it's usually pretty easy
|
[17:58] Evet
|
i'm gonna implement redis-like append-to-disk
|
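Editor's sketch of a simple append-to-disk message log of the kind Evet mentions. It is not part of 0MQ and not Redis's actual file format: the length-prefixed record layout and the function names `append_message` and `replay_messages` are assumptions chosen for illustration.

```python
import os
import struct


def append_message(path, payload):
    """Append one message (bytes) as a 4-byte big-endian length plus the body."""
    with open(path, "ab") as log:
        log.write(struct.pack(">I", len(payload)))
        log.write(payload)
        log.flush()
        os.fsync(log.fileno())   # force the record to disk before returning


def replay_messages(path):
    """Read every complete record back, ignoring a truncated record at the end."""
    messages = []
    with open(path, "rb") as log:
        while True:
            header = log.read(4)
            if len(header) < 4:
                break
            (size,) = struct.unpack(">I", header)
            body = log.read(size)
            if len(body) < size:
                break            # the writer died mid-record; stop here
            messages.append(body)
    return messages
```

Calling `append_message` on each received frame before processing it, and `replay_messages` at startup, would give a basic form of persistence. Syncing on every message is slow; batching the `fsync` calls trades durability for throughput.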
[18:01] lt_schmidt_jr
|
and tehy will sen you a login e-mail
|
[18:01] lt_schmidt_jr
|
oops
|
[18:03] pieterh
|
lt_schmidt_jr: wrong windows, yeah, it happens :-)
|
[18:03] lt_schmidt_jr
|
and bad typing :)
|
[18:08] pieterh
|
sustrik: any joy?
|
[18:09] sustrik
|
debugging i
|
[18:09] sustrik
|
it
|
[18:09] pieterh
|
:-)
|
[18:31] cremes
|
sustrik: how hairy is this bug?
|
[18:34] sustrik
|
it's not that bad so far
|
[20:22] pieterh
|
sustrik: what happened?
|
[21:28] apexi200sx
|
when would something like zeromq be used over say an AMQP server like RabbitMQ, and vice versa
|
[21:33] apexi200sx
|
cheers will have a look
|
[21:34] apexi200sx
|
Fast[BDC]: well have you come from using a broker like rabbitmq ?
|
[21:38] apexi200sx
|
Fast[BDC]: why what have you implemented and y, sorry for being nosey
|
[21:44] apexi200sx
|
Fast[BDC]: I have heard of twisted, and perspective broker is its RPC mechanism is it not
|
[21:46] apexi200sx
|
Fast[BDC]: what sort of products, ? are you saying you gather price data from various sources over the internet of loads of different products that ur customer also sells, and u are providing the customer an automated way of price setting
|
[21:46] apexi200sx
|
?
|
[21:47] apexi200sx
|
Fast[BDC]: 3 million products, flippin heck , so you have to scan them frequently for changes
|
[21:48] cremes
|
you guys might be interested in this 0mq whitepaper: http://whaleshark.zeromq.org/
|
[21:50] apexi200sx
|
Fast[BDC]: how do you retrieve the datasets, is it web crawling ?
|
[21:50] apexi200sx
|
Fast[BDC]: HTML scraping ?
|
[21:57] apexi200sx
|
Fast[BDC]: just had a look at execnet, never heard of that before, sounds like it has a decent feature set
|
[21:58] apexi200sx
|
Fast[BDC]: where/what do you think ur bottleneck is
|
[22:11] apexi200sx
|
u running all this in the cloud (like amazon ?) and you need to be able to add new instances on the fly ?
|
[22:39] Evet
|
is there a publisher or subscriber limit for in-process?
|