GoKu
Feb 23 2004, 12:25 PM
Ok timings would be something like 7,3,3,2.5
Good timings are: 2,6,2,2; 2,5,2,2; 2,11,2,2
Which timings work best varies from system to system.
Now, what you should get also depends on whether you want to keep your CPU overclocked. You could get PC3200 with good timings, run @ 200 FSB, and lower the multiplier, and you wouldn't have any O/C at all. You could leave the multiplier alone, use 200 FSB, and be overclocked. It's personal preference. Your BEST bet is to get 200 FSB and run at a lower clock speed. Memory bandwidth seems to affect performance more than CPU speed now. Timings are crucial too; with good FSB and slack timings it's pointless. You're better with lower and tighter than higher and looser.
I'm guessing you have a 333 FSB chip, so if you want to run everything stock I'd get some good PC2700. Stick with brand names as Bull said. Corsair XMS PC2700LL (low latency) is good for that. Or almost any PC3200, I would suspect, could run @ 166 w/ tight timings.
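To put rough numbers on the multiplier trade-off, here is a quick Python sketch. The multipliers (11 and 9) are made-up values for illustration only, not a recommendation for any particular chip; the only relationship assumed is CPU clock = FSB x multiplier, using nominal FSB figures.

```python
def cpu_clock_mhz(fsb_mhz: float, multiplier: float) -> float:
    """Core clock is the FSB clock times the CPU multiplier."""
    return fsb_mhz * multiplier


# Hypothetical multipliers, chosen only to illustrate the trade-off.
stock = cpu_clock_mhz(166, 11)        # stock FSB, stock multiplier
same_speed = cpu_clock_mhz(200, 9)    # higher FSB, lower multiplier
overclocked = cpu_clock_mhz(200, 11)  # higher FSB, multiplier left alone

print(stock, same_speed, overclocked)
```

With these assumed values the second setup lands at roughly the stock clock while running the faster FSB, and the third is a real overclock.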
GoKu
Feb 23 2004, 03:40 PM
Memory timings
Memory performance is not determined entirely by bandwidth, but also by the speed at which the memory responds to a command, or the time it must wait before it can start or finish reading or writing data. These are memory latencies or reaction times (timings). Memory timings control the way your memory is accessed and can contribute to either better or worse 'real-world' performance of your system.
Internally DRAM has a huge array of cells that contain data. (If you've ever used Microsoft's Excel, try and picture it that way.) A pair of row and column addresses can uniquely address each cell in the DRAM. DRAM communicates with a memory controller through two main groups of signals: Control-Address signals and Data signals. These signals are sent to the RAM in order for it to read/write data, address and control. The address is of course where the data is located on the memory banks, and the control signals are various commands needed to read or write. There are delays before a control signal can be executed or finish and this is where we get memory timings. Memory timings are most often expressed as a string of four numbers, separated by dashes, from left to right or vice-versa, like this: 2-2-2-5 [CAS-tRP-tRCD-tRAS]. These values represent how many clock cycles long each delay is, but are not expressed in the order in which they occur. Different bioses will display them differently and there may be additional options (timings) available.
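As a rough illustration of what those four numbers mean in real time, here is a small Python sketch. It assumes the CAS-tRP-tRCD-tRAS order quoted above (your bios may list them differently) and a 200 MHz memory clock picked purely as an example; the function names are my own.

```python
from typing import Dict

# Field order assumed to be the one quoted above: CAS-tRP-tRCD-tRAS.
FIELDS = ("CAS", "tRP", "tRCD", "tRAS")


def parse_timings(text: str) -> Dict[str, float]:
    """Split a string such as '2-2-2-5' into named cycle counts."""
    parts = text.split("-")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} values, got {len(parts)}")
    return {name: float(value) for name, value in zip(FIELDS, parts)}


def cycles_to_ns(cycles: float, clock_mhz: float) -> float:
    """One clock cycle lasts 1000 / clock_mhz nanoseconds."""
    return cycles * 1000.0 / clock_mhz


timings = parse_timings("2-2-2-5")
for name, cycles in timings.items():
    print(f"{name}: {cycles:g} cycles = {cycles_to_ns(cycles, 200):.1f} ns")
```

The same cycle count means a shorter real delay at a higher clock, which is why timings and frequency have to be judged together.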
Which timings mean what?
In most motherboards, numerous settings can be found to optimize your memory. These settings are often found in the Advanced Chipset section of the popular Award bioses. In certain instances, the settings may be placed in odd locations, so please consult your motherboard manual for specific information. Below are common latency options:
Command rate - is the delay (in clock cycles) between when chip select is asserted (i.e. the RAM is selected) and commands (i.e. Activate Row) can be issued to the RAM. Typical values are 1T (one clock cycle) and 2T (two clock cycles).
CAS (Column Address Strobe or Column Address Select) - is the number of clock cycles (or Ticks, denoted with T) between the issuance of the READ command and when the data arrives at the data bus. Memory can be visualized as a table of cell locations and the CAS delay is invoked every time the column changes, which is more often than row changing.
tRP (RAS Precharge Delay) - is the speed or length of time that it takes DRAM to terminate one row access and start another. In simpler terms, it means switching memory banks.
tRCD (RAS (Row Address Strobe) to CAS delay) - As it says, it's the time between RAS and CAS access, i.e. the delay between when a memory bank is activated and when a read/write command is sent to that bank. Picture an Excel spreadsheet with numbers across the top and along the left side. The numbers down the left side represent the Rows and the numbers across the top represent the Columns. The time it would take you, for example, to move down to Row 20 and across to Column 20 is RAS to CAS.
tRAS (Active to Precharge or Active Precharge Delay) - controls the length of the delay between the activation and precharge commands ---- basically how long after activation can the access cycle be started again. This influences row activation time which is taken into account when memory has hit the last column in a specific row, or when an entirely different memory location is requested.
These timings or delays occur in a particular order. When a Row of memory is activated to be read by the memory controller, there is a delay before the data on that Row is ready to be accessed; this is known as tRCD (RAS to CAS, or Row Address Strobe to Column Address Strobe delay). Once the row has been activated, a read command is sent, again by the memory controller, and the delay before it actually starts reading is the CAS (Column Address Strobe) latency. When reading is complete, the Row of data must be de-activated, which requires another delay, known as tRP (RAS Precharge), before another Row can be activated. The final value is tRAS, which comes into play whenever the controller has to address different rows in a RAM chip. Once a row is activated, it cannot be de-activated until the delay of tRAS is over.
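The order described above can be turned into a back-of-the-envelope latency count. The Python sketch below is a simplified model of my own, not anything from a datasheet: it ignores tRAS, command rate, and burst length, and just adds up the delays for a read that hits an already open row versus one that has to close a row and open another.

```python
def row_hit_cycles(cas: float) -> float:
    """Row already active: only the CAS latency is paid."""
    return cas


def row_miss_cycles(cas: float, trcd: float, trp: float) -> float:
    """Different row in the same bank: precharge (tRP), activate (tRCD), read (CAS)."""
    return trp + trcd + cas


# Example values only.
print(row_hit_cycles(2))          # 2 cycles
print(row_miss_cycles(2, 2, 2))   # 6 cycles
print(row_miss_cycles(2.5, 3, 3)) # 8.5 cycles
```

Under this simplified model a row change costs several times what a column change does, which is one way to see why tRCD and tRP matter so much.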
To tweak or not to tweak?
In order to really maximize performance from your memory, you'll need to gain access to your system's bios. There is usually a Master Memory setting, often rightly called Memory Timing or Interface, which usually gives you the choice to set your memory timings by SPD or Auto, by preset Optimal and Aggressive timings (e.g. turbo and ultra), or lastly by an Expert or Manual setting that will enable you to manipulate individual memory timing settings to your liking.
Are the gains of the perfect, hand-tweaked memory timing settings worth it over the automatic settings? If you're just looking to run at stock speeds and want absolute stability, then the answer to that question would probably be no. The relevance would be nominal at best and you would be better off going by SPD or Auto. However, if your setup is up on the cutting edge of technology or you’re pushing performance to the limit as do some overclockers, or gamers or tweakers, it may have great relevance.
SPD (Serial Presence Detect)
SPD is a feature available on all DDR modules. This feature solves compatibility problems by making it easier for the BIOS to properly configure the system to optimize your memory. The SPD device is an EEPROM (Electrically Erasable Programmable Read Only Memory) chip, located on the memory module itself, that stores information about the DIMM module's size, timings, speed, data width, voltage, and other parameters. If you configure your memory by SPD, the bios will read those parameters during the POST routine (bootup) and will automatically adjust values in the BIOS according to the module manufacturer's preset specifications.
There is one caveat though. At times the SPD contents are not read correctly by the bios. With certain combinations of motherboard, bios, and memory, setting SPD or Auto may result in the bios selecting full-fast timings (lowest possible numbers), or at times full-slow timings (highest possible numbers). This is often the culprit in situations where it appears that a particular memory module is not compatible with a given board. Often in these cases the SPD contents are not being read correctly and the bios is using faster memory timings than the module or system as a whole can boot with. In cases like these, try booting with a different module, setting the bios to allow manual timings, and then setting those timings to safer (higher) values; this will often allow the combination to work.
Ok so I want to tweak, what do I do?
Now for the cool stuff!!!
The first order of business, when tweaking your memory, is to deactivate the automatic RAM configuration -- SPD or Auto. With SPD enabled, the SPD chip on the memory module is read to obtain information about the timings, voltage and clock speed and those settings are adjusted accordingly. These settings are, however, very conservative to ensure stable operation on as many systems as possible. With a manual configuration, you can customize these settings for your own system and in most cases, the memory modules will remain stable even when they exceed the manufacturer's specifications.
As a general rule, a lower number (or timing) will result in improved performance. After all, if it takes fewer cycles to complete an operation, then more operations fit within X amount of time. However, this comes at a cost, and that is stability. It is similar to wireless networking with short and long preambles. A long preamble might be slower, but in a heavy network environment it is much more reliable than a short preamble because there is more certainty a packet is for your NIC. The same goes for memory: the more cycles used, in general, the more stable the performance. This is inherently true for all of the timings, because accessing precisely the right part of the memory requires accuracy, and allowing more time for the operation makes it more accurate. The most typical values are 2 and 3. You might ask: Why can't we use 1 or even 0 values for memory timings? JEDEC specifies that it's not possible for current DRAM technology to operate as it should under such conditions. Depending on the motherboard, you might be able to squeeze '1' on certain timings, but it will very likely result in memory errors and instability. And even if it doesn't, it is unlikely to result in a performance gain.
If you are not planning on overclocking the clock speed of your RAM or if you have fast RAM rated at speeds above that of your current FSB, it may be possible to just lower the timings for a performance gain in certain applications that require most frequent accesses to system memory like, for instance, games. Memory timings can vary depending on the performance of RAM chips used. Not all memory modules will exhibit the ability to use certain timings without producing errors. So testing, trial and error, is required.
Here are general guidelines to follow while "tweaking":
As with CPU/video card overclocking, adjusting the memory timings should be done methodically and with ample time to test each adjustment.
lower figures = better performance, but lower overclockability and possibly diminished stability.
higher figures = lesser performance, but increased overclockability and more stability -- to an extent
tRCD & tRP are usually equal numbers between 2 and 4. In tweaking for more overclockability, lower tRP first between these two.
CAS should be either 2.0 or 2.5. Many systems, most nforce2, fail to boot with a 3.0 setting or have stability problems. CAS is not the most critical of the various timings, unlike what is taught by many. In general, the importance of CAS when placed against tRP and tRCD is nominal. Reducing CAS has a relatively minor effect on memory performance, while lower tRP & tRCD values result in a much more substantial gain. In other words, if you had to choose, 3-3-2.5 would be better than 4-4-2.0 (tRCD-tRP-CAS).
tRAS should always be larger than the aforementioned timings (see below).
tRAS is unique, in that lowering it can lead to problems and lesser performance. tRAS is the only timing that has no effect on real performance, if it is configured as it should be. By definition, real-life performance is the same with different tRAS settings, with a certain exception. This document from Mushkin outlines how tRAS should be the sum of tRCD, CAS, and 2. For example, if you are using a tRCD of 2 and a CAS of 2 on your RAM, then you should set tRAS to 6. At values lower than that, theory would dictate lesser performance as well as catastrophic consequences for data integrity, including hard drive addressing schemes --- truncation, data corruption, etc --- as a cycle or process would be ended before it's done. How is it possible for memory timings to affect my hard drive? When the system is shut down or a program is closed, physical ram data that has become corrupted may be written back to the hard drive and that’s where the consequences for the hard drive come in. Also let’s not forget when physical ram data is translated by the operating system to virtual memory space located on the hard drive.
While it's important to consider the advice of experts like Mushkin, your own testing is still valuable. Systems – both AMD & Intel alike, can indeed operate with stability with 2-2-2-5 timings, and even exhibit a performance gain as compared to the theoretically mandated 2-2-2-6 configuration. The most important thing in any endeavor is to keep an open mind, and don't spare the effort. Once you've tried both approaches extensively it will be clear to you which is superior for your particular combination of components.
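For reference, the Mushkin rule of thumb quoted above (tRAS = tRCD + CAS + 2) is easy to check against whatever you have set. The Python sketch below only encodes that rule; the function names are my own, and, as noted, a setting below the guideline may still turn out stable on your particular hardware.

```python
def recommended_tras(trcd: float, cas: float) -> float:
    """Rule of thumb from the text: tRAS = tRCD + CAS + 2."""
    return trcd + cas + 2


def tras_below_guideline(tras: float, trcd: float, cas: float) -> bool:
    """True if the configured tRAS is shorter than the rule suggests."""
    return tras < recommended_tras(trcd, cas)


print(recommended_tras(2, 2))         # 6
print(tras_below_guideline(5, 2, 2))  # True:  2-2-2-5 is under the guideline
print(tras_below_guideline(6, 2, 2))  # False: 2-2-2-6 matches it
```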
Dealing with Memory Speeds / Frequencies
When the memory frequency runs at the same speed as the FSB, it is said to be running in synchronous operation. When memory and FSB are clocked differently (memory lower or higher than the FSB), it is said to be in asynchronous mode. On both AMD and Intel platforms, the most performance benefits are seen when the FSB of the processor is run synchronously with the memory. Although Intel based systems have a slight exception, this is completely true of all AMD-supporting chipsets. When looking at the AMD-supporting chipsets, async modes are to be avoided like the plague. AMD-supporting chipsets offer less flexibility in this regard due to poorly implemented async modes. Even if it means running our memory clock speed well below the maximum feasible for a given memory, an Athlon XP system will ALWAYS exhibit best performance running the memory in sync with the FSB. Therefore, a 166FSB Athlon XP would run synchronously with DDR333/PC2700 (2*166) and give better performance than running with DDR400/PC3200, despite its numbers being bigger.
Only Intel chipsets have implemented async modes that have any merit. If you are talking about the older i845 series of chipsets, running an async mode that runs the memory faster than the FSB is crucial to top system performance. And with the newer dual channel Intel chipset (i865/875 series) in an overclocked configuration, often you must run an async mode that runs the memory slower than the FSB for optimal results. The async modes in SiS P4 chipsets also work correctly.
To achieve synchronous operation, there is usually a Memory Frequency or DRAM ratio setting in the bios of your system that will allow you to set the memory speed to either a percentage of the FSB (i.e. 100%) or a fraction (or ratio), i.e. N/N where N is any integer available to you. If you want to run memory at non 1:1 ratio speeds, motherboards use dividers that create a ratio of CPU FSB: memory frequency. However, it is easy to see the intrinsic problem with this and why synchronous operation is preferable on all PC platforms. If there is a divider, then there is going to be a gap between the time that data is available for the memory, and when the memory is available to accept the data (or vice versa). There will also be a mismatch between the amount of data the CPU can send to the memory and how much the memory can accept from the CPU. This will cause slowdowns as you will be limited by the slowest component.
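The ratio arithmetic itself is simple, and a few lines of Python make it concrete. This is just a sketch: the function names are my own, the ratio is written FSB:Memory as in the bios, and the DDR label is taken as twice the memory clock.

```python
from fractions import Fraction


def memory_clock_mhz(fsb_mhz: int, fsb_part: int, mem_part: int) -> Fraction:
    """Apply an FSB:Memory ratio (e.g. 5:4) to the FSB clock."""
    return Fraction(fsb_mhz) * Fraction(mem_part, fsb_part)


def ddr_label(clock_mhz: Fraction) -> str:
    """DDR transfers twice per clock, so DDR rating = 2 x clock."""
    return f"DDR{int(clock_mhz * 2)}"


def is_synchronous(fsb_part: int, mem_part: int) -> bool:
    """Synchronous operation means a 1:1 ratio."""
    return fsb_part == mem_part


for fsb, ratio in [(200, (1, 1)), (200, (5, 6)), (250, (5, 4))]:
    clock = memory_clock_mhz(fsb, *ratio)
    mode = "sync" if is_synchronous(*ratio) else "async"
    print(f"{fsb} MHz FSB at {ratio[0]}:{ratio[1]} -> {clock} MHz ({ddr_label(clock)}, {mode})")
```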
Here are three examples illustrating the three possible states of memory operation:
200MHz FSB speed with 100% or 1:1 (FSB:Memory ratio) results in 200MHz memory speed (DDR400)
Such a configuration is wholly acceptable for any AMD system, memory should be set this way at all times for best performance. Asynchronous FSB/Memory Speeds are horridly inefficient on AMD systems, but may well be the optimal configuration for P4 systems.
200MHz FSB speed with 120% or 5:6 (FSB:Memory ratio) results in 240MHz memory speed (DDR480)
This example shows running the memory at higher asynchronous speeds. Assume we have a Barton 2500+ which by default is running at a FSB of 333 MHz (166 MHz X 2) and we also have PC3200 memory which by default is running at 400 MHz. This is a typical scenario because many people think that faster memory running at 400 MHz will speed up their system. Or they fail to disable the SPD or Auto setting in their bios. There is NO benefit at all derived from running your memory at a higher frequency (MHz) than your FSB on Athlon XP/Duron systems. In actuality, doing so has a negative effect.
Why does this happen? It happens because the memory and FSB can't "talk" at the same speeds, even though the memory is running at higher speeds than the FSB. The memory would have to "wait for the FSB to catch up", because higher async speeds forces de-synchronization of the memory and FSB frequencies and therefore increases the initial access latency on the memory path -- causing as much as a 5% degradation in performance.
This is another ramification of the limiting effect of the AMD dual-pumped FSB. A P4's quad pumped FSB (along with the superior optimization of the async modes) allows P4's to benefit in some cases from async modes that run the memory faster than the FSB. This is especially true of single channel P4 systems. There still are synchronization losses inherent in an async mode on any system, but the adequate FSB bandwidth of the P4 allows the additional memory bandwidth produced by async operation to overcome these losses and produce a net gain.
250MHz FSB speed with 80% or 5:4 (FSB:Memory ratio) results in 200MHz memory speed (DDR400)
This example is most often used in overclocking situations where the memory is not able to keep up with the speed of the FSB. On AMD platforms, there is really no point having a high FSB, if the memory can’t keep up. When the memory or any other component is holding back system performance, this is called a “bottleneck”. As in the example above, a memory bottleneck would be if you were running your memory at DDR400 MHz with a 500 MHz (250x2) system bus. The memory would only be providing 3.2GB/s of bandwidth while the bus would be theoretically capable of transmitting 4.0GB/s of bandwidth. A situation like this would not help overall system performance.
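The 3.2GB/s and 4.0GB/s figures above can be reproduced with a quick calculation. The Python sketch below assumes a 64-bit (8 byte) wide path for both the memory and the Athlon XP's dual-pumped FSB, and counts 1 GB as 1000 MB the way marketing figures do; the function name is my own.

```python
BUS_WIDTH_BYTES = 8  # assumption: 64-bit wide memory bus and front-side bus


def bandwidth_gb_s(clock_mhz: float, transfers_per_clock: int) -> float:
    """Peak theoretical bandwidth in GB/s (1 GB = 1000 MB here)."""
    return clock_mhz * transfers_per_clock * BUS_WIDTH_BYTES / 1000.0


memory = bandwidth_gb_s(200, 2)  # DDR400: 200 MHz, two transfers per clock
bus = bandwidth_gb_s(250, 2)     # 250 MHz dual-pumped FSB

print(f"memory {memory:.1f} GB/s, bus {bus:.1f} GB/s, shortfall {bus - memory:.1f} GB/s")
```

These are peak theoretical numbers only; real-world throughput will be lower on both sides.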
Think of it like this: let's say you had a highway going straight into a mall, with an identical highway going straight out of the mall. Both highways have the same number of lanes and initially they have the same 45mph speed limit. Now let's say that there's a great deal of traffic flowing in and out of the mall and in order to get more people in and out of the mall quicker, the department of transportation agrees to increase the speed limit of the highway going into the mall from 45mph to 70mph; the speed limit of the highway leaving the mall is still stuck at 45mph. While more people will be able to reach the mall quicker, there will still be a bottleneck in the parking area leaving the mall, since the increased numbers of people that are able to get to the mall still have to leave at the same rate. This is equivalent to increasing the FSB frequency but leaving the memory frequency/bandwidth unchanged or set to a slower speed. You're speeding up one part of the equation while leaving the other part untouched.
Sometimes the fastest memory is not affordable or available. In this case, more focus should be placed on balancing the FSB and memory frequencies while still keeping latencies as low as possible AND while still maintaining CPU clock speed (GHz) by increasing the multiplier. The benefit of a faster FSB (and higher bandwidth) will only become clearer as clock speeds (GHz) increase; the faster the CPU gets, the more it will depend on getting more data quicker.
The only real benefit of async modes on AMD platforms is that they come in handy to overclockers for testing purposes: to determine their max FSB and to eliminate the memory as a possible cause for not being able to achieve a desired stable FSB speed. Even so, async modes on early nforce2 based motherboards caused many problems; problems as serious as bios corruption.
Looking to the Intel side of the fence, async modes that run the memory slower than the FSB have merit because of how async modes are implemented in the Intel chipsets. This is extremely important, as we cannot change the CPU multiplier on modern Intel systems and therefore have to use an async mode to allow substantial overclocking on the majority of systems utilizing the current 200/800MHz fsb family of P4 processors. To illustrate, if you increase the FSB on a new C stepped P4 to 250 MHz (250 x 4) with a 1:1 ratio, memory will work at 250 MHz (DDR500). This can be done in two ways. The first is with exotic PC4000 or DDR500 memory modules, but these are expensive just to run synchronously at such speeds and their timings aren't exactly delightful either. The other way is to overclock DDR400/DDR433 to much higher speeds through overvolting, but this is seemingly dangerous and often motherboards don’t provide nearly enough voltage to achieve such speeds without physical voltage mods. Therefore, to avoid expensive PC4000 or volt mods, you change the memory ratio so that a 250FSB overclock will become something that the memory can handle, to allow for a substantial overclock of the Pentium 4. In the example, to let PC3200(DDR400) remain as DDR400 with a 250MHz.
Buying Memory
Very touchy subject for some. For others, RAM is RAM, right? Right!!
If you plan on just running at stock speeds, then your hunt for memory just became easier. Not by a lot, but nevertheless, easier. With AMD platforms, the requirements for memory are more varied, because there are several different models of processors, many of which utilize different bus speeds. Higher end Athlon XPs, like the 3000+ and 3200+, require the use of at least DDR400 while lower end ones may be satisfied with the use of DDR333 or DDR266 modules. Newer C Stepped P4s all require DDR400 for optimal performance. My only suggestion is to never buy generic RAM, and for good reason. There are three factors that go into the quality of a memory module: the quality of the chips, the quality of the printed circuit board (PCB), and manufacturing. None of these factors are present on poorly manufactured modules. How do you tell what’s poorly manufactured? Simple, those are cheap and you don’t see the name of a company attached to them. So really, when buying memory for typical operation, buy something that’s fairly well known. As an IT professional, I work on a large network. I'll conservatively estimate that our farm of PCs has 500 memory modules collectively. We buy name-brand memory (Crucial & Kingston Value) exclusively.
If your intention is to overclock, I can't stress this enough - YOU NEED HIGH QUALITY COMPONENTS!! I can't tell you how many times I've tried to help people out with overclocking their systems and come to find out that the one problem was they have no-name PC2100 dimms stuffed in their slots. Ick. One thing I can say with certainty is that you should buy PC3200/PC3500 at the bare minimum for any new system, AMD or Intel, if your sole purpose is to overclock. Not only for overclocked systems but for the sake of performance in general. Experience dictates that the advantages of fast memory are worth the slightly higher price that you have to pay to get it. Depending on the brand, there isn't enough of a price differential between quality PC2700 and PC3200 to justify not getting PC3200. Quality RAM is not always expensive, but expensive RAM is often quality.
However, buying much faster RAM isn't always the best idea, especially for AMD chipsets. Overall, PC4000 and higher modules are not quite compatible with these motherboards. High speed DDR (PC3700 and all the way up to PC4500) generally sacrifices all the useful functions (i.e. lower timings, compatibility with motherboards, ability to run in async mode) for the sake of attaining stable operation at high speeds. An individual buying these speeds of memory must be in a sound state of mind, and with a money-back guarantee or enough budget to avoid disappointment in the event of unsatisfactory results. Otherwise, it is advisable to stick with lower latency PC3200 or PC3500 modules. Consider that Fighter Jet A is your PC4000 memory and is built for super-fast speeds, but cannot maneuver as well as Fighter Jet B, which is lower latency PC3200. In a dog fight on AMD terrain, Fighter Jet B will win because the terrain is mountainous and requires more maneuverability. Likewise, Fighter Jet A would have more of a chance on Intel terrain because on such a terrain speed matters more and maneuverability (or latency) isn't as important. But since, as discussed earlier, we can realistically use an async mode on Intel systems, it may well prove that doing so and using top quality PC3200 or PC3500 memory and their attendant lower latency will allow Fighter Jet B to triumph in most all cases.
Bottom line is, don't skimp on your RAM selection. You'll be kicking yourself later if you do. But just the same, don't assume that because a particular memory type is expensive it is also superior for your application. Make all effort to avoid generic brands as they are famous for cutting corners during production and burning the wrong values into the SPD chips. Unhappy buyers are forced to struggle with poor performance or system crashes without knowing exactly why.
Why are you recommending PC3500, when my motherboard only supports PC3200?
Memory modules really have no fixed speed. Like the tire on a car, there is a "rating" on it. When a tire is rated for 150mph, it means it can run as fast as 150mph maximum. It also means that it can run at any speed lower than that. It is also quite safe to say that the tire should withstand 160 mph, just not as "safely" according to the Government's test environment.
Memory is very similar in this way. Many people ask if a PC3500 or PC3700 module would run/blow up/be compatible in a motherboard originally designed to use PC3200 or PC2700. The answer is, hell ya! JEDEC (the “government”) has only approved PC3200. This, coupled with the fact that no processor needs memory rated higher than PC3200, is why motherboard manufacturers do not state support for newer, faster modules. But higher rated speeds of DDR are always ‘backward compatible’ so to speak, or capable of running at lower speeds. Older systems stand to gain from newer and faster modules. Even if they can't run the module at its top supported frequency, you can still tweak the timing parameters to maximize performance at lower clock speeds in ways that otherwise would not be possible with lower-rated modules.
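To make the "rating is a ceiling, not a fixed speed" idea concrete, here is a small Python sketch. The lookup table holds the usual PC-to-DDR pairings for these module ratings; the function name is my own, and the check uses nominal FSB figures (133/166/200) and says nothing about which timings the module will hold at that speed.

```python
# Common module ratings and their effective DDR speed (MHz).
RATINGS = {
    "PC2100": 266,
    "PC2700": 333,
    "PC3200": 400,
    "PC3500": 433,
    "PC3700": 466,
    "PC4000": 500,
}


def can_run_in_sync(module: str, fsb_mhz: int) -> bool:
    """True if the module's rated DDR speed covers a 1:1 FSB (DDR = 2 x FSB)."""
    return RATINGS[module] >= 2 * fsb_mhz


print(can_run_in_sync("PC3500", 200))  # True: rated above what a 200 FSB needs
print(can_run_in_sync("PC2700", 166))  # True: right at its rating
print(can_run_in_sync("PC2700", 200))  # False: would be running past its rating
```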
From "ATi" over @ 3DMaXX