Discussion:
kill backup
unknown
2006-08-30 07:57:38 UTC
What is the safest way to kill a multi-stripe backup, i.e. a
database dump striped across multiple files on a device (not a
tape backup)?

I want to kill the following database dump:
dump database tmpdb to "compress::/syb/tmp/tmpdb.01.dmp",
stripe on "compress::/syb/tmp/tmpdb.02.dmp",
stripe on "compress::/syb/tmp/tmpdb.03.dmp"

Thanks in advance.
A.M.
2006-09-01 02:51:06 UTC
Post by unknown
what is the safest way to kill a multi stripe backup i.e
dump database striped on multiple files on a device ( not a
tape backup) ?
That shouldn't matter.
Post by unknown
dump database tmpdb to "compress::/syb/tmp/tmpdb.01.dmp",
stripe on "compress::/syb/tmp/tmpdb.02.dmp",
stripe on "compress::/syb/tmp/tmpdb.03.dmp"
What happens when you kill the spid?

-am © MMVI
unknown
2006-09-05 10:43:55 UTC
Hmm, last time I replied it did not post. Trying again.

- On the production db of about 340Gb (data), killing the dump
across 10 stripes hung the server.

- The sybmultibuf processes had to be killed multiple times on
Unix to make sure they actually died; I killed the same spid
repeatedly with a 5-second gap, and it finally went away on the
4th try.

- I was unable to delete one of the dump stripe files with the
"rm" command.

I came across an example of the sp_volchanged procedure on the
Sybase website for aborting a file-system dump (note: not a tape
dump). I tried it on the test server; it does not work.

All in all, I am hoping for a proc which will
- gracefully abort the dump db spid without hanging the server
- gracefully abort all the sybmultibuf processes related to the
session
- remove all the (striped) dump files as the dump is being
aborted.
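Until such a proc exists, at least the file-cleanup step can be scripted. A hedged shell sketch, assuming the stripe files follow the tmpdb.NN.dmp naming from the original dump command; the spid and sybmultibuf kills are only indicated in comments because they depend on the server setup:

```shell
# Sketch of the cleanup step, assuming stripe files are named
# <db>.NN.dmp as in the dump command quoted in this thread.
cleanup_dump_stripes() {
  dir="$1"; db="$2"
  # Steps that would come first (shown as comments only -- server,
  # login and spid are site-specific):
  #   1. kill the dump spid inside ASE, e.g. "kill <spid>" via isql
  #   2. kill leftover sybmultibuf processes, e.g.
  #        ps -ef | awk '/[s]ybmultibuf/ {print $2}' | xargs kill
  # 3. remove the stripe files once nothing holds them open:
  for f in "$dir/$db".*.dmp; do
    [ -e "$f" ] && rm -f -- "$f"
  done
}
```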

I know this can be done manually, but on a production system,
with the experience I had, one cannot afford to hang or bounce a
server just because a dump spid is killed.

EJ
Post by A.M.
Post by unknown
what is the safest way to kill a multi stripe backup i.e
dump database striped on multiple files on a device (
not a tape backup) ?
That shouldn't matter.
Post by unknown
dump database tmpdb to "compress::/syb/tmp/tmpdb.01.dmp"
, stripe on "compress::/syb/tmp/tmpdb.02.dmp",
stripe on "compress::/syb/tmp/tmpdb.03.dmp"
What happens when you kill the spid?
-am © MMVI
Mikhail T.
2006-11-15 15:57:04 UTC
Post by unknown
- On the production db of about 340Gb (data) the dump on 10
stripes when killed hung the server.
Just curious, how long does it take your server to dump such a thing
(compressed)? Ten stripes means 10 processes all running
zlib-compression...

What kind of system is this?

Thanks!

-mi
--
Sybase! Release the OpenClient's source -- under any license...
A.M.
2006-11-16 00:37:00 UTC
Post by Mikhail T.
Post by unknown
- On the production db of about 340Gb (data) the dump on 10
stripes when killed hung the server.
Just curious, how long does it take your server to dump such a thing
(compressed)? Ten stripes means 10 processes all running
zlib-compression...
What kind of system is this?
At a guess, I'd say it's doing roughly 34GB per stripe (give
or take a few GB). He's already mentioned he's using an HP EVA
SAN and is using Unix commands like 'rm', so it's a big-iron
Unix system like HP, IBM or Sun. Not sure how long it would
take; it depends on the i/o subsystem and where the bottlenecks
are. He's dumping to the one filesystem, so it should be
heavily striped. The bandwidth of the fibre to the SAN may
be a limiting factor since he's using the default compression
level. I'd be surprised if it took less than 2 hours.
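The arithmetic behind that guess, for what it's worth (the 340Gb size and the 2-hour floor are from this thread; the division is mine):

```shell
# Back-of-envelope: dumping 340 GB in 2 hours implies this aggregate
# rate, spread over the 10 compressing stripes.
awk 'BEGIN {
  total_mb = 340 * 1024        # 340 GB of data, in MB
  seconds  = 2 * 3600          # the 2-hour guess
  printf "aggregate: %.1f MB/s, per stripe: %.1f MB/s\n",
         total_mb / seconds, total_mb / seconds / 10
}'
```

About 48 MB/s in aggregate, or roughly 4.8 MB/s per stripe -- plausible for zlib at the default level, which is why the 2-hour figure is not unreasonable.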

-am © MMVI
Mikhail Teterin
2006-11-23 06:17:33 UTC
So its a big iron Unix system like HP, IBM or Sun. [...]
Not sure how long it would take. Depends on the i/o subsystem and where
the bottlenecks are.
With any CPUs other than recent Intels or AMDs, the bottleneck is going
to be the processors (he is compressing).

Employing an assembler-optimized zlib on our old dual-Opteron 244 system,
for example, I can compress (level 9) at about 12Mb/s[1] (per processor),
which is easily saturated by a modern i/o subsystem.

Even with newer Opterons and at lower compression levels, I can't imagine
exceeding 40Mb/second per CPU.

However, to my knowledge, no "big iron" system has CPUs that are anywhere
close to that in raw speed -- and assembler optimizations don't exist for
them either :-(

For example, from my experiments just now, our very recent Sun box
(Sun-Fire-V440) with 1594MHz SPARCs can only gzip at 16.3Mb/second -- at
the lowest level (-1). At level 6 (the default), the rate falls to
11.1Mb/s, and at 9 to a pathetic 3.7Mb/s (3 times slower than the
Opteron-244 helped by assembler coding).
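A repeatable version of this kind of measurement, as a bash sketch (the `time`/TIMEFORMAT trick is bash-specific; the test file and the rates it produces are illustrative, not the SPARC/Opteron numbers above):

```shell
# Measure gzip throughput per compression level, counting only user CPU
# (so other load on the box is ignored) and converting to MB/s.
gzip_user_mbps() {                  # usage: gzip_user_mbps LEVEL FILE
  local TIMEFORMAT='%U'             # bash: make `time` print user seconds only
  local u
  u=$( { time gzip "-$1" -c "$2" > /dev/null; } 2>&1 )
  awk -v bytes="$(wc -c < "$2")" -v u="$u" \
      'BEGIN { if (u > 0) printf "%.1f\n", bytes / 1048576 / u
               else       print  "n/a" }'
}

f=$(mktemp)
seq 1 400000 > "$f"                 # a few MB of compressible text
for level in 1 6 9; do
  echo "gzip -$level: $(gzip_user_mbps "$level" "$f") MB/s"
done
rm -f "$f"
```

The "n/a" branch covers machines fast enough that the user time rounds to zero on a small file.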

EJ, can you share your stats, please? How long does the compressed dump take
you, and what compression level do you use?

Thanks!

-mi

[1] The compression rates depend strongly on the data being compressed -- my
stats are from compressing actual database dumps. Compressing
already-compressed data, such as JPEGs, takes longer...
A.M.
2006-11-23 10:31:13 UTC
Post by Mikhail Teterin
So its a big iron Unix system like HP, IBM or Sun. [...]
Not sure how long it would take. Depends on the i/o subsystem and where
the bottlenecks are.
With any CPUs, but the recent Intels or AMDs, the bottleneck is going to be
with the processors (he is compressing).
Employing an assembler-optimized zlib on our old dual-Opteron 244 system,
for example, I can compress (level 9) at about 12Mb/s[1] (per processor),
which is easily saturated by a modern i/o subsystem.
Even with newer Opterons and at lower compression levels, I can't imagine
exceeding 40Mb/second per CPU.
However, to my knowledge, no "big iron" system has CPUs, which are anywhere
close to that in raw speed -- and assembler-optimizations don't exist for
them either :-(
You are making a fundamental mistake in your assumptions by
comparing apples to oranges. All the big-iron systems use
RISC CPUs, which cannot be compared to Intel CPUs on their
clock speed alone.

Also, these SMP systems are multiuser, so you need to take
into account what else is happening throughout the system.
It's unlikely that the compressions will have the system to
themselves.
Post by Mikhail Teterin
For example, from my experiments just now, our very recent Sun box
(Sun-Fire-V440) with 1594MHz SPARCs can only gzip at 16.3Mb/second -- at
Unless it has these in it -
http://www.sun.com/processors/UltraSPARC-IVplus/index.xml
I wouldn't exactly call it new. The V440 is classed as
entry-level and has the UltraSPARC IIIi.
http://www.sun.com/servers/entry/v440/index.xml
Post by Mikhail Teterin
the lowest level (-1). At level 6 (the default), the rate falls to
11.1Mb/s, and at 9 -- to the pathetic 3.7Mb/s (3 times slower, than
Opteron-244 helped by assembler-coding).
Er ... all machine code is assembler-based. I'm assuming you
mean it's hand-coded - which I find laughable. Only on Wintel
do people still do this. You are also leaving out other
details from the comparison. What's the rest of the system
like? Memory? Disk? Number of CPUs? Load? 64bit code or 32bit?
Post by Mikhail Teterin
[1] The compression rates depend strongly on the data being compressed -- my
stats are from compressing actual database dumps. Compressing already
compressed data, such as JPGs is going to take longer...
Naturally.

-am © MMVI
Mikhail Teterin
2006-11-23 15:54:02 UTC
Post by A.M.
Post by Mikhail Teterin
However, to my knowledge, no "big iron" system has CPUs, which are
anywhere close to that in raw speed -- and assembler-optimizations don't
exist for them either :-(
You are making a fundamental mistake in your assumptions by
comparing apples to oranges. All the big iron systems use
RISC CPUs which cannot be compared to the Intel CPUs based
on their clock speed.
Wrong. I'm not comparing "clock speed". I'm comparing actual throughput
of "gzip -9", "gzip", and "gzip -1" (and zlib's "minigzip" with the same
options). Sybase's libcompress plugin uses zlib (and foolishly links it in
statically, BTW)... minigzip and gzip use the same algorithms and produce
the same files.

Your loyalty is commendable, but sadly misplaced. You have to accept the
facts: today's RISC CPUs are _slower_ than Intels and AMDs. (The "big iron"
vendors have recognized this long ago -- Sun would not have gone through
the trouble of porting Solaris to AMD64, if it weren't for this
realization. Similarly, HP would not have bothered with "Merced".)
Post by A.M.
Also, these SMP systems are also multiuser so you need to
take into account what else is happening throughout the
system.
Using "time" (tcsh's built-in) and counting only the "utime" allows me to
ignore other users. Come on... You know this, and you should expect me to
know this too.

But no, the SPARC system (the loser) was idle last night when I did the
test. The Opteron system (the winner) was busy compressing 3 incoming
backup streams, but time accounts for that...
Post by A.M.
Post by Mikhail Teterin
For example, from my experiments just now, our very recent Sun box
(Sun-Fire-V440) with 1594MHz SPARCs can only gzip at 16.3Mb/second -- at
I wouldn't exactly call it new.
We just bought it...
Post by A.M.
The V440 is classed as entry-level and has the UltraSPARC IIIi.
Oh, come on -- so the IVplus will be, gasp, 30% faster?.. :-)
Post by A.M.
Post by Mikhail Teterin
the lowest level (-1). At level 6 (the default), the rate falls to
11.1Mb/s, and at 9 -- to the pathetic 3.7Mb/s (3 times slower, than
Opteron-244 helped by assembler-coding).
Er ... all machine code is assember based. I'm assuming you
mean its hand coded - which I find laughable. Only on Wintel
do people still do this.
First, my box uses AMD's Opterons. Second, it runs FreeBSD -- that alone
makes your "only on Wintel" factually wrong :-)

More importantly, hand-coding "hot" functions may be laughable to you, but
it improved performance here by about 20%. Sun's C compiler may be much
better than GNU's, and thus there may indeed be no point in
hand-optimizing zlib for SPARCs. But that's irrelevant -- because gzip will
*still* be about 2 times faster on a 3-year-old Opteron system than on a
3-month-old SPARC system.
Post by A.M.
You are also leaving out other details from the comparison. What's the
rest of the system like? Memory? Disk? Number of CPUs? Load? 64bit code or
32bit?
Memory, disk, number of CPUs, and load make no difference for a one-to-one
comparison when "time" is used (and it always should be), as long as the
machine is not paging heavily -- which it is not...
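A quick illustration of why user time is load-insensitive (bash `time`; `sleep` stands in for a process stalled behind other work):

```shell
# The user-CPU figure charges a process only for the cycles it burned
# itself, not for wall-clock time lost waiting. A command that waits one
# second but computes nothing shows ~1s real time and ~0s user time.
TIMEFORMAT='real=%R user=%U'   # bash: report real and user seconds
out=$( { time sleep 1; } 2>&1 )
echo "$out"
```

The same separation is what makes utime comparable across a loaded Opteron box and an idle SPARC box.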

/usr/bin/gzip is 32-bit on Solaris and 64-bit on FreeBSD/amd64. You think I
should build a 64-bit gzip from source and try again? How big an
improvement do you expect? 10%?

[hold on]

Yes, indeed. Rebuilding gzip from source using the latest Sun compiler
(version 11) with ``-fast -xarch=v9b'' raises the throughput of "gzip -1"
from 16-17 to 17-18. I would seek more precision if it mattered -- but it
does not, because the Opteron's throughput is about double that.

When compressing with -9, though, the 64-bit gzip _loses_ to the 32-bit
version on SPARC... Probably because it is not hand-optimized.

For the highest compression level (-9), the picture is even worse for
SPARCs. Compressing the same 524Mb piece of a Sybase "device" file takes
67.399u on FreeBSD/amd64 (7.77Mb/sec) and 152.18u on Solaris/SPARC
(3.44Mb/sec with the 64-bit gzip, and 3.79Mb/s with the original 32-bit).
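A sanity check on those rates -- pure arithmetic from the figures quoted above:

```shell
# 524 MB divided by the user-CPU seconds reported for each platform
# reproduces the quoted per-second rates.
awk 'BEGIN {
  printf "amd64: %.2f MB/s\n", 524 / 67.399   # FreeBSD/amd64
  printf "sparc: %.2f MB/s\n", 524 / 152.18   # Solaris/SPARC, 64-bit gzip
}'
```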

The "big iron" vendors have very stable OSes, good (fast I/O) chipsets, and
solid-built machines (mostly), but their CPUs are DOG FOOD -- and have been
for a very long time. (Their user-space software tends to suck too, BTW, but
that's another topic).

Yours,

-mi
A.M.
2006-11-24 11:06:59 UTC
Post by Mikhail Teterin
Post by A.M.
You are making a fundamental mistake in your assumptions by
comparing apples to oranges. All the big iron systems use
RISC CPUs which cannot be compared to the Intel CPUs based
on their clock speed.
Wrong. I'm not comparing "clock speed". I'm comparing actual throughput
of "gzip -9", "gzip", and "gzip -1" (and zlib's "minigzip" with the same
options). Sybase's libcompress-plugin uses zlib (and foolishly links it in
statically, BTW)... minigzip and gzip use the same alghorithms and produce
the same files.
In that case, the obvious thing to do is to compare the
exact same source code programs compiled for each hardware
/ OS version.
Post by Mikhail Teterin
Your loyalty is commendable, but sadly misplaced. You have to accept the
facts: today's RISC CPUs are _slower_ than Intels and AMDs. (The "big iron"
vendors have recognized this long ago -- Sun would not have gone through
the trouble of porting Solaris to AMD64, if it weren't for this
realization. Similarly, HP would not have bothered with "Merced".)
Sun bought Interactive Unix back in the early 90s and ported
Solaris to x86, based on Interactive, not long after that.
So your comment about 'gone through the trouble' is specious.
It's been no trouble for them to port new versions of Solaris.

Note that AMD isn't Intel, and the 64bit AMD chips don't
carry around all the baggage that the Intel chips do.
Post by Mikhail Teterin
Post by A.M.
Also, these SMP systems are also multiuser so you need to
take into account what else is happening throughout the
system.
Using "time" (tcsh's built-in) and counting only the "utime" allows me to
ignore other users. Come on... You know this, and you should expect me to
know this too.
Sorry, I cannot make assumptions about the competency of
another person.
Post by Mikhail Teterin
But no, the SPARC-system (the loser) was idle last night, when I did the
test. The Opteron-system (the winner) was busy compressing 3 incoming
backup streams, but time accounts for that...
I notice you've still left out the other details
about memory, number of CPUs, 64 bit or 32, and also
threads.
Post by Mikhail Teterin
Post by A.M.
I wouldn't exactly call it new.
We just bought it...
Irrelevant. I could have just bought an Apple II. I doubt
anyone would call it new.
Post by Mikhail Teterin
Post by A.M.
The V440 is classed as entry-level and has the UltraSPARC IIIi.
Oh, come one -- so the IVplus will be, gasp, 30% faster?.. :-)
And how much faster are the AMDs?
Post by Mikhail Teterin
Post by A.M.
Er ... all machine code is assember based. I'm assuming you
mean its hand coded - which I find laughable. Only on Wintel
do people still do this.
First, my box uses AMD's Opterons. Second, it runs FreeBSD -- that alone
makes your "only on Wintel" factually wrong :-)
Would you prefer it if I'd said Intel instead? Since I'm
sure there are some Linux nuts that also write assembler.
Post by Mikhail Teterin
More importantly, hand-coding "hot" functions may be laughable to you, but
it improved performance here by about 20%. Sun's C-compiler may be much
better than GNU's and thus there may, indeed, be no point in
hand-optimizing zlib for SPARCs.
You'd also have to consider the GNU source code to begin
with. I'll assume from your comments about your experience
that you are familiar with all the Stallman jokes.
Post by Mikhail Teterin
But that's irrelevant -- because gzip will
*still* be about 2 times faster on a 3 year-old Opteron system, than on a 3
months-old SPARC system.
Only if all other factors are equal.
Post by Mikhail Teterin
Post by A.M.
You are also leaving out other details from the comparison. What's the
rest of the system like? Memory? Disk? Number of CPUs? Load? 64bit code or
32bit?
Memory, disk, number of CPUs, and load make no difference for a one-to-one
comparision, when "time" is used (and it always should be). As long as the
machine is not paging heavily, which it does not...
I don't consider this a one-to-one comparison. You should
know why.
Post by Mikhail Teterin
/usr/bin/gzip is 32-bit on Solaris and 64-bit on FreeBSD/amd64. You think, I
should build a 64-bit gzip from source and try again?
Yes.
Post by Mikhail Teterin
How big of an improvement do you expect? 10%?
Depends. If you want to compare 32bit and 64bit, you
should also make the OS 32bit - which you can't do on
the V440. So that issue is moot. But you can at least
compare 64bit UltraSPARC against 64bit AMD64.
Post by Mikhail Teterin
[hold on]
Yes, indeed. Rebuilding gzip from source using the latest Sun compiler
(version 11) with ``-fast -xarch=v9b'' raises the throughput of "gzip -1"
from 16-17 to 17-18. I would seek more precision, if it mattered -- but it
does not, because Opteron's througput is about double that.
Next try setting the compiler optimisation levels.
Post by Mikhail Teterin
When compressing with -9, though, the 64-bit gzip _loses_ to the 32-bit
version on SPARC... Probably, because it is not hand-optimized.
It's more likely to do with the architectural differences
between 32bit and 64bit. The Sun newsgroups on Usenet
(comp.unix.solaris and comp.sys.sun.*) were full of these
sorts of discussions.
Post by Mikhail Teterin
For highest compression level (-9), the picture is even worse for Sparcs.
Compressing the same 524Mb piece of a Sybase "device" file takes 67.399u on
FreeBSD/amd64 (7.77Mb/sec) and 152.18u on Solaris/SPARC (3.44Mb/sec with
64-bit gzip, and 3.79Mb/s with the original 32-bit).
I was going to ask if you were using the exact same file for
your compression tests each time. Next you need to run the
same OS on both hardware platforms.
Post by Mikhail Teterin
The "big iron" vendors have very stable OSes, good (fast I/O) chipsets, and
solid-built machines (mostly), but their CPUs are DOG FOOD -- and have been
for very long time. (Their user-space software tends to suck too, BTW, but
that's another topic).
I still disagree but, like you said, it's another topic.

-am © MMVI
Mikhail Teterin
2006-11-24 14:40:23 UTC
Post by A.M.
In that case, the obvious thing to do is to compare the
exact same source code programs compiled for each hardware
/ OS version.
I did. gzip on FreeBSD/amd64 was compared to gzip on Solaris/SPARC. I then
also measured minigzip on FreeBSD/amd64 and a self-compiled gzip64 on
Solaris/SPARC -- for a more complete picture. SPARC loses by a factor of 2.
Post by A.M.
Sorry, I cannot make assumptions about the competency of
another person.
Then err on the other side...
Post by A.M.
I notice you've still left out the other other details
about memory, number of CPUs, 64 bit or 32, and also
threads.
Because these are not relevant. When counting only the "utime", the amounts
of memory and number of CPUs may make no more than 1-2% difference. We are
talking about a gap of 100% or more...
Post by A.M.
Post by Mikhail Teterin
Post by A.M.
I wouldn't exactly call it new.
We just bought it...
Irrelevant. I could have just bought an Apple II. I doubt
anyone would call it new.
Your Apple II would be a *used* one -- the manufacturer no longer sells
them. Unlike the V440s, which are among Sun's *current* offerings.
Post by A.M.
Post by Mikhail Teterin
Post by A.M.
The V440 is classed as entry-level and has the UltraSPARC IIIi.
Oh, come one -- so the IVplus will be, gasp, 30% faster?.. :-)
And how much faster are the [newer -mi] AMDs?
By about the same. The point was, even if I compared newest SparcIVplus with
the old Opterons, the SPARC would've still lost -- by a big margin.
Post by A.M.
Post by Mikhail Teterin
First, my box uses AMD's Opterons. Second, it runs FreeBSD -- that alone
makes your "only on Wintel" factually wrong :-)
Would you prefer it if I'd said Intel instead? Since I'm
sure there are some Linux nuts that also write assember.
I would prefer it if you stopped trying to label altogether. (I have no
idea where "Linux" came from, BTW...)

Anyway, they may be "nuts", but they do it -- my point was, their assembler
code is available for Opterons, but not for Sparcs, which further
contributes (unfairly) to the already substantial gap between the CPU
architectures.
Post by A.M.
You'd also have to consider the GNU source code to begin with.
Whatever its merits, it runs twice as fast on an Opteron-244 as on a
SPARC-IIIi. And, /most importantly/, it is what Sybase uses for
compressing the dumps...
Post by A.M.
Post by Mikhail Teterin
But that's irrelevant -- because gzip will *still* be about 2 times
faster on a 3 year-old Opteron system, than on a 3 months-old SPARC
system.
Only if all other factors are equal.
Your questioning of my benchmarking is most prudent. Why don't you try your
own comparison? gzip's source code is widely available, as is the hardware
in question -- I suspect you can log in to a SPARC system and an Opteron
(or Intel) system from the same desktop from which you are reading this
message...
Post by A.M.
Post by Mikhail Teterin
Post by A.M.
You are also leaving out other details from the comparison. What's the
rest of the system like? Memory? Disk? Number of CPUs? Load? 64bit code
or 32bit?
Memory, disk, number of CPUs, and load make no difference for a
one-to-one comparision, when "time" is used (and it always should be). As
long as the machine is not paging heavily, which it does not...
I don't consider this a one-to-one comparison. You should
know why.
Everything I know says these items are irrelevant. But if you insist, here:

SPARC (Sun's V440): Opteron (hand-built from parts):
16Gb of RAM 1Gb of RAM
4 SPARC-IIIi @1533MHz 2 Opteron-244 @1800MHz
local fibre disks (15K RPM) local SATA disks (7200 RPM)
0 load before timing load of 3 before timing

Do you still feel, the SPARC was unfairly handicapped?
Post by A.M.
Next try setting the compiler optimisation levels.
I used `-fast -xarch=v9b'. The `-fast' means maximum optimization and it
implies `-native', which tells the compiler to optimize for the host
machine. It is possible that marginally better results could be obtained
by using slightly lower optimization (-xO4 is sometimes better
than -xO5), but, frankly, I'm not going to bother...
Post by A.M.
Post by Mikhail Teterin
For highest compression level (-9), the picture is even worse for Sparcs.
Compressing the same 524Mb piece of a Sybase "device" file takes 67.399u
on FreeBSD/amd64 (7.77Mb/sec) and 152.18u on Solaris/SPARC (3.44Mb/sec
with 64-bit gzip, and 3.79Mb/s with the original 32-bit).
I was going to ask if you were using the exact same file for
your compression tests each time. Next you need to run the
same OS on both hardware platforms.
Are you saying that using FreeBSD may make a machine so much faster than
Solaris? :-) I am a FreeBSD fan, but I would not expect it to make a
difference of more than a few percent...
Post by A.M.
Post by Mikhail Teterin
[...] but their CPUs are DOG FOOD [...]
I still disagree [...]
Name a CPU-intensive serial task that completes faster on a SPARC than on
an Opteron system...

-mi
A.M.
2006-11-29 04:32:14 UTC
Post by Mikhail Teterin
Post by A.M.
In that case, the obvious thing to do is to compare the
exact same source code programs compiled for each hardware
/ OS version.
I did. gzip on FreeBSD/amd64 was compared to gzip on Solaris/SPARC. I then
also measured minigzip on FreeBSD/amd64 and a self-compiled gzip64 on
Solaris/SPARC -- for a more complete picture. SPARC loses by a factor of 2.
OK, I'll assume it's the same version of gzip too.
Post by Mikhail Teterin
Post by A.M.
Sorry, I cannot make assumptions about the competency of
another person.
Then err on the other side...
I always start from the bottom.
Post by Mikhail Teterin
Post by A.M.
I notice you've still left out the other other details
about memory, number of CPUs, 64 bit or 32, and also
threads.
Because these are not relevant. When counting only the "utime", the amounts
of memory and number of CPUs may make no more than 1-2% difference. We are
talking about a gap of 100% or more...
Yes, but they are factors that need to be considered.
Ideally, you should either bind the process to a CPU or
shut the other ones off. I suspect utime includes the
CPU cache swapping. Code size of 64bit or 32bit makes
a big difference - as you already discovered.
Post by Mikhail Teterin
Post by A.M.
Post by Mikhail Teterin
Post by A.M.
I wouldn't exactly call it new.
We just bought it...
Irrelevant. I could have just bought an Apple II. I doubt
anyone would call it new.
Your Apple II would be a *used* one -- the manufacturer no longer sells
them. Unlike the V440s, which are among Sun's *current* offerings.
But not the leading edge offerings.
Post by Mikhail Teterin
Post by A.M.
Post by Mikhail Teterin
Post by A.M.
The V440 is classed as entry-level and has the UltraSPARC IIIi.
Oh, come one -- so the IVplus will be, gasp, 30% faster?.. :-)
And how much faster are the [newer -mi] AMDs?
By about the same. The point was, even if I compared newest SparcIVplus with
the old Opterons, the SPARC would've still lost -- by a big margin.
Yes, but we're trying to determine the true performance difference.
Post by Mikhail Teterin
Post by A.M.
Post by Mikhail Teterin
First, my box uses AMD's Opterons. Second, it runs FreeBSD -- that alone
makes your "only on Wintel" factually wrong :-)
Would you prefer it if I'd said Intel instead? Since I'm
sure there are some Linux nuts that also write assember.
I would prefer it, if you stopped trying to label altogether. (I have no
idea, where "Linux" came from, BTW...)
Because it runs on Intels as well.
Post by Mikhail Teterin
Anyway, they may be "nuts", but they do it -- my point was, their assembler
code is available for Opterons, but not for Sparcs, which further
contributes (unfairly) to the already substantial gap between the CPU
architectures.
You can assemble on SPARC as well. Look at the compiler
options closely. But my point is, nobody does, because
it's a waste of time. Apart from bootstrap and device
drivers, there's no real assembly programming done.
Post by Mikhail Teterin
Post by A.M.
You'd also have to consider the GNU source code to begin with.
Whatever its merits, it runs twice faster on Opteron-244 than on SPARC-IIIi.
And, /most importantly/ , it is what Sybase uses for compressing the
dumps...
And this was based on your tests using one "smallish" file.
How much time did it take all up, and how much compression was
there? Now, was it really worth it?
Post by Mikhail Teterin
Post by A.M.
Post by Mikhail Teterin
But that's irrelevant -- because gzip will *still* be about 2 times
faster on a 3 year-old Opteron system, than on a 3 months-old SPARC
system.
Only if all other factors are equal.
Your questioning of my benchmarking is most prudent. Why don't you try your
own comparision? gzip's source code is widely available as is the hardware
in question -- I suspect, you can login to a SPARC system and an Opteron
(or Intel) system from the same desktop, from which you are reading this
message...
I don't have access to Opteron systems.
Post by Mikhail Teterin
Post by A.M.
Post by Mikhail Teterin
Post by A.M.
You are also leaving out other details from the comparison. What's the
rest of the system like? Memory? Disk? Number of CPUs? Load? 64bit code
or 32bit?
Memory, disk, number of CPUs, and load make no difference for a
one-to-one comparision, when "time" is used (and it always should be). As
long as the machine is not paging heavily, which it does not...
I don't consider this a one-to-one comparison. You should
know why.
16Gb of RAM 1Gb of RAM
local fibre disks (15K RPM) local SATA disks (7200 RPM)
0 load before timing load of 3 before timing
Do you still feel, the SPARC was unfairly handicapped?
OK, what is the size of their CPU caches? What are the times
when bound to one CPU?
Post by Mikhail Teterin
Post by A.M.
Next try setting the compiler optimisation levels.
I used `-fast -xarch=v9b'. The `-fast' means maximum optimization and it
implies `-native', which tells the compiler to optimize for the host
machine. It is possible, that marginally better results could be obtained
by using slightly lower optimization (-xO4 is, sometimes, better
than -xO5), but, frankly, I'm not going to bother...
OK, I just wanted to check that you were setting xarch after
fast. What about "-xdepend", "-xvector", "-xunroll" etc.?
Post by Mikhail Teterin
Post by A.M.
I was going to ask if you were using the exact same file for
your compression tests each time. Next you need to run the
same OS on both hardware platforms.
Are you saying, that using FreeBSD may make a machine so much faster, than
Solaris? :-) I am a FreeBSD fan, but I would not expect it to make a
difference of over a few percents...
No, using the same OS means the only difference is the CPU and
hardware. It's an issue of OS libraries, OS process control, etc.
Post by Mikhail Teterin
Post by A.M.
Post by Mikhail Teterin
[...] but their CPUs are DOG FOOD [...]
I still disagree [...]
Name a CPU-intensive serial task, that completes faster on a SPARC than on
an Opteron system...
That's not a legitimate question and you know it. Try adding
more CPUs and memory to your Opteron system and see how well
it scales.

For more comments, feel free to read these threads -

http://groups.google.com/group/comp.unix.solaris/browse_frm/thread/45ce314c491253ed/23213331b39bf9f1?lnk=st&q=&rnum=1#23213331b39bf9f1
http://groups.google.com/group/comp.unix.solaris/browse_frm/thread/fbc95fa461b2c175/c77e967b7041f328?lnk=st&q=&rnum=5#c77e967b7041f328

-am © MMVI
Mikhail T.
2006-12-05 18:51:39 UTC
Post by A.M.
Post by Mikhail Teterin
By about the same. The point was, even if I compared newest SparcIVplus
with the old Opterons, the SPARC would've still lost -- by a big margin.
Yes, but we're trying to determine the true performance difference.
No, that's not what I was trying to do. I asked EJ (the initiator of this
thread) to post his timings, and then we both speculated what these could
be.

The exact performance difference is hard to measure. The ballpark estimate
is easy. And it is over 2 times -- even if ALL the factors you mentioned
elsewhere work in SPARC's favor... That's it...
Post by A.M.
I don't have access to Opteron systems.
Intel currently "holds the crown" -- so run your estimates on a (recent) PC
vs. a SPARC. Even by the real-time clock on the PC vs. utime on the Sun,
the PC will win. By far... By how much *exactly* is of much less interest...

-mi
--
Sybase! Release the OpenClient's source -- under any license...
unknown
2006-09-05 10:45:08 UTC
.. trying again

Hmm, last time I replied it did not post. Trying again.

- On the production db of about 340Gb (data), killing the dump
across 10 stripes hung the server.

- The sybmultibuf processes had to be killed multiple times on
Unix to make sure they actually died; I killed the same spid
repeatedly with a 5-second gap, and it finally went away on the
4th try.

- I was unable to delete one of the dump stripe files with the
"rm" command.

I came across an example of the sp_volchanged procedure on the
Sybase website for aborting a file-system dump (note: not a tape
dump). I tried it on the test server; it does not work.

All in all, I am hoping for a proc which will
- gracefully abort the dump db spid without hanging the server
- gracefully abort all the sybmultibuf processes related to the
session
- remove all the (striped) dump files as the dump is being
aborted.

I know this can be done manually, but on a production system,
with the experience I had, one cannot afford to hang or bounce a
server just because a dump spid is killed.

EJ
Post by A.M.
Post by unknown
what is the safest way to kill a multi stripe backup i.e
dump database striped on multiple files on a device (
not a tape backup) ?
That shouldn't matter.
Post by unknown
dump database tmpdb to "compress::/syb/tmp/tmpdb.01.dmp"
, stripe on "compress::/syb/tmp/tmpdb.02.dmp",
stripe on "compress::/syb/tmp/tmpdb.03.dmp"
What happens when you kill the spid?
-am © MMVI
A.M.
2006-09-06 00:27:08 UTC
Post by unknown
.. trying again
hummm last time i replied it did not post. Trying again.
The news server seems a bit slow now and again. Give it
some time.
Post by unknown
- On the production db of about 340Gb (data) the dump on 10
stripes when killed hung the server.
What ASE version are you using? Check for later EBFs
and read the cover letters for any updates to the
backup server.
Post by unknown
- When the sybmultibuf processes were killed on Unix ( they
had to be killed multiple times to make sure it gets killed
- used the same spid with gap of 5 sec. the spid was killed
after 4th try).
- Unable to delete one of the dump stripe file using the
"rm" command
What error message were you getting for this? This
sounds odd and it's more likely that the file was
open and still being written to. If that's the case,
removing the file leaves the link open and it shows
up again when the next buffer is flushed.
Post by unknown
I came across an example of sp_volchanged utility on sybase
website to abort a "File system"-(note not a tape dump)
dump. Tried on the test server it does not work.
You mean you used the ABORT option to sp_volchanged?
What error did you get or was there a message in the
backup server's log?
Post by unknown
All in all, I am hoping for a proc which will
- Gracefully Abort the dump db spid; should not hang the
server
- Gracefully Abort all the sybmultibuf processes related to
the session
- Remove all the (striped) dump files as the dump is being
aborted.
Removing the files will still need to be done
manually. The others may require an EBF.
Post by unknown
I know this can be done manually, but on production system
with the experience I had, one cannot affort to hang or
bounce a server just because a dump process spid is killed.
I should really ask why are you killing the dump
process rather than letting it complete.

-am © MMVI
unknown
2006-09-07 12:23:56 UTC
Permalink
Post by A.M.
Post by unknown
.. trying again
hummm last time i replied it did not post. Trying again.
The newsserver seems a bit slow now and again. Give it
some time.
thanks - got it.
Post by A.M.
Post by unknown
- On the production db of about 340Gb (data) the dump on
10 stripes when killed hung the server.
What ASE version are you using? Check for later EBFs
and read the cover letters for any updates to the
backup server.
Post by unknown
- When the sybmultibuf processes were killed on Unix (
they had to be killed multiple times to make sure it
gets killed - used the same spid with gap of 5 sec. the
spid was killed after 4th try).
- Unable to delete one of the dump stripe file using the
"rm" command
What error message were you getting for this? This
sounds odd and its more likely that the file was
open and still being written to. If that's the case,
removing the file leaves the link open and it shows
up again when the next buffer is flushed.
"rm" silently completed. No errors. Retried with "rm -i"; it
asked for confirmation but still didn't delete. Verified there
was no alias for rm.
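Incidentally, that is consistent with how Unix unlink works: `rm` only removes the directory entry, a writer that still holds the file open keeps the inode (and its disk space) alive until it closes, and the name only "comes back" if the writer reopens the path. A small, Sybase-free demonstration:

```shell
#!/bin/sh
# rm removes the name, not the open file: writes through an
# already-open descriptor still succeed after the unlink, and
# the disk space is reclaimed only when the descriptor closes.
f=$(mktemp)
exec 3>"$f"              # hold the file open on fd 3 (stand-in for the backup server)
rm "$f"                  # completes silently, just as in the post
[ -e "$f" ] || echo "name gone"
echo "still writing" >&3 # this write still succeeds, into the unlinked inode
exec 3>&-                # on close, the space is finally freed
```

This prints "name gone", yet the write afterwards succeeds -- which is why a "deleted" stripe can keep a filesystem full until the backup server lets go of it.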
Post by A.M.
Post by unknown
I came across an example of sp_volchanged utility on
sybase website to abort a "File system"-(note not a tape
dump) dump. Tried on the test server it does not work.
You mean you used the ABORT option to sp_volchanged?
What error did you get or was there a message in the
backup server's log?
Yes, I did mean the ABORT option of sp_volchanged. On the
Sybase website I found an example in one of the solved cases
that indicated using the ABORT option of sp_volchanged for
dumps on a filesystem.
The same command on my test server completed "successfully": (0
rows affected) (return status = 0). No errors in the
backupserver error log; the dump continues as normal...
Post by A.M.
Post by unknown
All in all, I am hoping for a proc which will
- Gracefully Abort the dump db spid; should not hang the
server
- Gracefully Abort all the sybmultibuf processes related
to the session
- Remove all the (striped) dump files as the dump is
being aborted.
Removing the files will still need to be done
manually. The others may require an EBF.
Post by unknown
I know this can be done manually, but on production
system with the experience I had, one cannot affort to
hang or bounce a server just because a dump process spid
is killed.
I should really ask why are you killing the dump
process rather than letting it complete.
Well, we have a typical Sybase setup: a single server with
multiple DBs. When the dump of this huge db is running, it
slows down the server and all processes, including those for
the other dbs. There are critical jobs running on the other
dbs. At times they happen to run in parallel with the dump,
which slows these jobs down, with a domino effect on the rest
of our system/application.
We have a separate disk for this huge db, the critical jobs are
bound to a high-priority execution class, and so on...

Now I can't describe here the importance and details of what
goes wrong when the critical jobs are delayed....

Sometimes killing a dump spid sounds more logical than having
it run for a few hours to completion.
Post by A.M.
-am © MMVI
A.M.
2006-09-07 20:30:42 UTC
Permalink
Post by unknown
Post by A.M.
Post by unknown
- Unable to delete one of the dump stripe file using the
"rm" command
What error message were you getting for this? This
sounds odd and its more likely that the file was
open and still being written to. If that's the case,
removing the file leaves the link open and it shows
up again when the next buffer is flushed.
"rm" silently completed. No errors. Retried with "rm -i" it
asked for confirmation but still didnt delete. Verified
there was no alias for rm.
What did the file look like? Did you do a long listing?
Was it growing?
Post by unknown
Post by A.M.
You mean you used the ABORT option to sp_volchanged?
What error did you get or was there a message in the
backup server's log?
Yes i did mean ABORT option of sp_volchanged. On sybase
website I could find an example in one of the solved cases
where it indicated to use ABORT of sp_volchange for dumps on
filesystem.
Same command on my test server completed successfully: (0
rows affected) (return status = 0). No errors in the
backupserver error log - dump continues as normal...
Ah, but were you prompted to run sp_volchanged at the
time?
Post by unknown
Post by A.M.
I should really ask why are you killing the dump
process rather than letting it complete.
Well we have a typical sybase setup. Single server multiple
DB's. When dump on this huge db is running it slows down the
server and all the processes including process for other db.
OK sounds reasonable.
Post by unknown
There are critical jobs running on other db's. At times they
happen to run in parallel with dump which slows down these
jobs with a domino affect on rest of our
system/application..
We have separate disk for this huge db, critical jobs are
bind to high priority execution class and so on...
When you say separate disk do you mean that the
devices for that db are totally isolated from the
others or just a separate disk within the same
disk farm? What about the dump device? Is that
separate and isolated as well or within a disk
grouping shared by the other devices?
Post by unknown
Now i cant describe here the importance and details about
what goes wrong when the critical jobs are delayed....
And is there a timeframe when you can dump while
the other critical jobs are not running?
Post by unknown
Sometimes killing a dump spid sounds logical than having it
run for few hours to completion.
If there are no other alternatives, you could try
quiescing the database and dumping the devices
at the OS level. This may also impact you if the
disk devices are shared in some way.

Another approach is to slow it down. Dump to tape
(a slow one), reduce the number of dump buffers,
lower the priority of the dump process, etc.

But that may not be really practical if it ends
up taking too long.

-am © MMVI
unknown
2006-09-08 14:31:00 UTC
Permalink
What did the file look like? Did you do a long listing?
Was it growing?

Hmmm... no, did not do it; or even if it was done, I don't
remember the file size changing. This was missed, I guess.



Ah, but were you promted to run sp_volchanged at the
time?

Nope, wasn't prompted. Hence the question: is there any safe
way to abort a backup? (My search of the Sybase solved cases
turned up the sp_volchanged example; I was hoping it might just
work, or that someone who has used it in a similar situation
would share their experience.)



When you say separate disk do you mean that the
devices for that db are totally isolated from the
others or just a separate disk within the same
disk farm? What about the dump device? Is that
separate and isolated as well or within a disk
grouping shared by the other devices?

The 340Gb db has its devices on an HP EVA 6000 SAN. The dump
device for this db is also on the HP EVA 6000 SAN (using
separate VGs - volume groups). The rest of the dbs sit on an
HP EVA 4000 SAN.



And is there a timeframe when you can dump while
the other critical jobs are not running?

The point being: the critical jobs complete in < 45 mins; these
jobs transfer the required data from ODS to the warehouse.
Hence killing a dump in our situation would help complete these
jobs - the critical users then already have the needed data for
early-morning processing. Sometimes (not often) dumps running
late (slowing down the system) would be acceptable to the ODS
group, but not to the warehouse group.
Post by A.M.
Post by unknown
Post by A.M.
Post by unknown
- Unable to delete one of the dump stripe file using
the "rm" command
What error message were you getting for this? This
sounds odd and its more likely that the file was
open and still being written to. If that's the
case, removing the file leaves the link open and
it shows up again when the next buffer is flushed.
"rm" silently completed. No errors. Retried with "rm -i"
it asked for confirmation but still didnt delete.
Verified there was no alias for rm.
What did the file look like? Did you do a long
listing?
Was it growing?
Post by unknown
Post by A.M.
You mean you used the ABORT option to
sp_volchanged? What error did you get or was there
a message in the backup server's log?
Yes i did mean ABORT option of sp_volchanged. On sybase
website I could find an example in one of the solved
cases where it indicated to use ABORT of sp_volchange
for dumps on filesystem.
(0 rows affected) (return status = 0). No errors in
the
Post by A.M.
Post by unknown
backupserver error log - dump continues as normal...
Ah, but were you promted to run sp_volchanged at the
time?
Post by unknown
Post by A.M.
I should really ask why are you killing the dump
process rather than letting it complete.
Well we have a typical sybase setup. Single server
multiple DB's. When dump on this huge db is running it
slows down the server and all the processes including
process for other db.
OK sounds reasonable.
Post by unknown
There are critical jobs running on other db's. At times
they happen to run in parallel with dump which slows
down these jobs with a domino affect on rest of our
system/application..
We have separate disk for this huge db, critical jobs
are bind to high priority execution class and so on...
When you say separate disk do you mean that the
devices for that db are totally isolated from the
others or just a separate disk within the same
disk farm? What about the dump device? Is that
separate and isolated as well or within a disk
grouping shared by the other devices?
Post by unknown
Now i cant describe here the importance and details
about what goes wrong when the critical jobs are
delayed....
And is there a timeframe when you can dump while
the other critical jobs are not running?
Post by unknown
Sometimes killing a dump spid sounds logical than having
it run for few hours to completion.
If there's no other alternatives, you could try
quiescing the database and dumping the devices
at the OS level. This may also impact you if the
disk devices are shared in some way.
Another approach is to slow it down. Dump to tape
(a slow one), reduce the number of dump buffers,
lower the priority of the dump process, etc.
But that may not be really practical if it ends
up taking too long.
-am © MMVI
A.M.
2006-09-08 20:30:06 UTC
Permalink
Post by A.M.
What did the file look like? Did you do a long listing?
Was it growing?
hummm... no did not do it or even if it was done, dont
remember file bytes changing. This was missed i guess.
OK, never mind.
Post by A.M.
Ah, but were you promted to run sp_volchanged at the
time?
Nope, wasnt prompted. Hence the question - is there any safe
way to abort a backup (my search in sybase solved cases
displayed sp_volchanged example; was hoping it might just
work or someone who has used it in similar situation would
share his experience)
sp_volchanged is only suitable for when you're prompted
to run it. Issuing it on its own won't do much good and
you're unlikely to be prompted for a backup going to a
filesystem (unless the filesystem runs out of space).
Post by A.M.
When you say separate disk do you mean that the
devices for that db are totally isolated from the
others or just a separate disk within the same
disk farm? What about the dump device? Is that
separate and isolated as well or within a disk
grouping shared by the other devices?
The 340Gb db has its devices on HP EVA 6000 SAN. The dump
device for this db is also on HP EVA 6000 SAN (using
seperate vg's-volume groups). The rest of the db's sit on HP
EVA 4000 SAN
OK, so there's no contention between database devices
but there's potentially a problem on the 6000 SAN.
Separate volumes don't imply separate disks. You'd
need to talk to your storage admin about that.

However, that isn't really pertinent now. The main thing
is that the databases shouldn't be experiencing disk
contention, so the performance impact you're seeing is due
to internal ASE tasks and memory handling. Do the
dbs you have use separate named caches? If not, the
dump of the large one could be swamping data cache
and thus affecting the others. Either put the large one
in its own bound cache or do the same to the others
(the former is the easiest).
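If the named-cache route is taken, the configuration is roughly the following sketch. Note: 'bigdb_cache', '2048M', and 'bigdb' are made-up names and sizes, and the exact requirements (e.g. whether a restart is needed to create the cache) vary by ASE version, so check the System Administration Guide for yours first.

```sql
-- Sketch only: hypothetical cache name, size, and database name.
-- Run from master with sa_role.
sp_cacheconfig 'bigdb_cache', '2048M'
go
-- Bind the large database so its dump traffic stops flushing the
-- default data cache used by the other databases.
sp_bindcache 'bigdb_cache', 'bigdb'
go
```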
Post by A.M.
And is there a timeframe when you can dump while
the other critical jobs are not running?
The point being, critical jobs complete in < 45 mins; These
jobs trasfer the required data from ODS to warehouse. Hence
killing a dump in our situation would help complete these
jobs - the *critical users already have the needed data for
early morning processing. Sometimes(not often) dumps running
late (slowing down the system) would be acceptable by the
ODS group, but not the warehouse group.
Can the dumps be run after the processing jobs are
run?

Alternatively, can you split up the databases into
separate server instances? If they need to be linked
you can look at using proxy tables/databases or
remote connections.

-am © MMVI