How to Tune Your Disc I/O: Let ASM Do It for You
Introduction to ASM and the Oracle Storage Model
[music]
>> John: So moving on, what is ASM in terms of general functionality? The phrase that
some people use is a database-centric file system, which is not bad. It is a file system
in that it can manage files, but the only types of files it can manage are database files.
You cannot use it for anything else.
The Oracle description is that it's portable – yes, it is portable up to a point.
Theoretically, ASM is available on all platforms, which is true, and the same on all platforms,
which is not quite correct. Conceptually it is portable, but the platform-specific
implementation details can be difficult.
High performance? Definitely. The performance of ASM can be astronomical. But it does require
a certain amount of tuning, and you need to be aware of the type of application for which
ASM is best suited – it's suitable for all applications, but some types of application will
perform better than others – and in particular the hardware environment on which it sits.
Fault tolerant? Definitely. Oracle is a bit ambivalent about this. With earlier
releases, there was a general recommendation that if your hardware could do RAID mirroring,
you should let the hardware do it. With the current release, Oracle is increasingly tending
to say, "Don't use your hardware RAID for mirroring. Let ASM do it all."
Clusterable? Definitely. As I said, I don't want to discuss that today.
And Oracle will often tell you that they can replace your third party RAID systems. Our
experience is slightly different.
One thing I definitely want to emphasize, which is not properly documented anywhere, is the
extent-based nature of ASM. Extent-based file systems are beginning to come into use. Those
of you who work with Linux, for example, will know that with Linux 6 we have the ext4 file
system, which is extent based. ext4 has now been back-ported, I think, to earlier releases;
you can get it on Linux 5 distributions now as well. But that's going from ext3 to ext4.
It shows you how powerful extent based file systems can be for performance.
The performance of ASM, largely because it is extent based, is potentially great – but
only if it's configured appropriately for your hardware and if it's configured appropriately
for your database. Potentially great performance.
[pause]
It works with your RAID. I would never position it as a replacement for third-party RAID unless
it really were a pretty low-end system.
[pause]
It can replace your hardware RAID system, but it doesn't have to. I believe it can work
with it very well indeed. That's where we come to the fact that DBAs and the system
administrators definitely need to work together.
[pause]
I also want to emphasize right at the beginning that it is mature technology. It was released
with version 10g; back then, it was a bit ropey – it worked, but it had issues. With
versions 11.1 and 11.2 it became a lot more stable, and 12c has some terrific advantages beyond
that.
To begin with, I want to start off very, very simply, going back to basics, just to
establish that for day-to-day administration the DBA doesn't really need to do anything
differently.
>> Dave: Pardon, John, quick question in the queue. Are there any benchmarks of ASM versus
file systems?
>> John: Ooh, benchmarks. Hmm. If you Google around, you'll find one or two. And the published
benchmarks are a bit thin on the ground. For example, I was looking at one the other day
- Red Hat had published a benchmark where they benchmarked ASM against their own GFS
(Global File System) and funnily enough, it showed that Global File System could outperform
ASM. I can't think why that would happen in the Red Hat benchmark. I looked into that
benchmark and I'm pretty sure I know why they got such poor performance from ASM. It's because
it was totally non-optimized. I would never have set it up that way.
Oracle's official benchmark published way back with version 10 benchmarked ASM against
Veritas, which I believe is said to be the best file system in the world. And ASM outperformed
Veritas by about 15%, which isn't much until again you look into that benchmark and it
turned out that the ASM test was just some ordinary DBA like me - you just install it,
click, click, click, run the benchmark, here are the results. And Veritas - they had a
Veritas engineer tuning the system for nearly two weeks and even then couldn't get comparable
performance. And that was impressive for me.
[pause]
But the unpublished benchmarks I've seen have been incredible. One was from a cell phone
company I was working with here in England; they benchmarked ASM against HP Quick I/O, and
ASM was outperforming HP Quick I/O by two to three times – and that was
for a standard IN platform switching system for the cell phone company.
Another one I saw, an unpublished benchmark, was from an investment bank, and the chap there,
the DBA, reckoned he was getting five times the performance after the change from file
system storage to ASM storage. Five times. I wouldn't say that many people would get
that much of an improvement, I have to say. I think the nature of their application, which
was a large-scale data warehouse, was absolutely perfect for ASM.
So, benchmarks – look around. You'll find a few, but not many I'm afraid.
The first thing I want to get rid of is a bit of fear, uncertainty, and doubt about ASM –
the worry that your more junior DBAs, the people who do the day-to-day administration of the
systems, may have to learn a lot. That isn't true.
Remember, from within the RDBMS environment there is no change at all. If I want to create
a tablespace, what do I do? I create a tablespace. All we do is the absolutely normal CREATE
TABLESPACE command.
[pause]
The datafile name I specified as +cwfiles, and that is the indicator that this is going
to be on an ASM disk group. But there's no difference apart from that. If we look at
what's going on in this environment – query DBA_DATA_FILES, then v$datafile.
Oops. Spelling mistake there.
[pause]
I was looking at DBA_DATA_FILES at this point, then the name column of v$datafile. That's
more like it. There we see the file has been created. Nothing special about the [07:04
inaudible] file. It's like any other.
[pause]
If your DBAs don't want to use Oracle-managed files, then of course they can specify a file
name, as I'm doing there. Otherwise we're getting these automatically generated file names.
Just bear in mind that your database administrators have absolutely nothing to worry about.
Your ASM system is just the same as any other Oracle system.
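The commands John describes can be sketched as follows. The disk group name +cwfiles is the one from his demonstration; the tablespace names, sizes, and the explicit file path are hypothetical, and the explicit-name variant creates an ASM alias rather than an OS path.

```sql
-- Oracle-managed file on ASM: name only the disk group and let
-- ASM generate the file name.
CREATE TABLESPACE demo_omf DATAFILE '+cwfiles' SIZE 100M;

-- If you'd rather not use Oracle-managed files, specify a name
-- (this path is hypothetical; it becomes an ASM alias).
CREATE TABLESPACE demo_named
  DATAFILE '+cwfiles/mydb/datafile/demo01.dbf' SIZE 100M;

-- The new files look like any other datafile.
SELECT name FROM v$datafile;
```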
[pause]
Now, a real quick review of the Oracle storage model – this description of the
architecture will help us understand why ASM performs so well. Within the relational
database management system, Oracle of course separates logical storage from physical. ASM
doesn't change that. The logical storage model is identical to what you see without
using ASM. ASM plugs in and replaces the physical storage model.
[pause]
What can it replace? The database storage. The true database files – datafiles,
tempfiles, online logfiles, controlfiles – can all go on ASM. Recovery and backup files –
anything from a flash recovery area (or fast recovery area, I should call it) – can go onto
an ASM disk group. The spfile, and in the next release the password file, and Data Pump
dumps – they can all go onto ASM disk groups.
What you cannot use it for, where ASM will not replace your conventional physical storage,
is the non-database files. That means the binaries, your Automatic Diagnostic Repository,
and – annoyingly – BFILEs. Files that are completely external to the database cannot
go on ASM unless you use ACFS, the ASM Cluster File System. But that would be
a different topic.
[pause]
So, we're replacing the physical storage of database files with ASM storage. It looks
like this.
[pause]
Being a database administrator, I am of course infatuated with normalization. We know how
logical storage works – logical storage on the left. Logically, Oracle stores data in
segments; segments live in tablespaces and are divided into extents, which consist of
multiple Oracle blocks. In conventional physical storage, tablespaces consist of multiple
datafiles, which are divided into operating system blocks.
[pause]
Think of the structure – the logical storage on the left and the conventional physical
storage in the middle here. This storage model has some horrible implications. Starting at
the bottom, if you want to read a row, of course Oracle must read the entire Oracle
block from disk. Okay, no problem. What's the operating system reading?
[pause]
To read that one Oracle block, the operating system has to work out where that block actually
resides. Take an example, say Linux ext3. Your OS block is probably 1 kilobyte; your
Oracle block maybe 8 kilobytes. So Linux has to read 8 OS blocks. Those 8 operating system
blocks could be anywhere on the disk – the way the inode will have distributed the file,
there's no reason to assume the blocks are adjacent. That could be 8 I/Os from all over
the disk.
Maybe you're not using a straightforward file system like that. Perhaps you're using RAID.
When you use RAID striping, you get the reverse problem. Your RAID stripe might
be, say, 128 k. So to read an 8 k Oracle block, you read a 128 k stripe.
[pause]
How much of that 128 k do you want? You want less than one tenth of it. You're throwing
away over 90% of your disk I/O.
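The arithmetic behind those two figures, as a quick sketch:

```sql
-- ext3 case: one 8 KB Oracle block spans 8 separate 1 KB OS blocks,
-- so potentially 8 I/Os scattered across the disk.
SELECT 8 / 1 AS os_blocks_read FROM dual;

-- RAID case: reading one 8 KB Oracle block costs a full 128 KB stripe,
-- so 93.75% of the I/O is thrown away.
SELECT ROUND(100 * (1 - 8 / 128), 2) AS pct_wasted FROM dual;
```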
There you see the problem with this. You've got dreadful issues of fragmentation.
If your system administrators and your file system are very good and manage to configure
the RAID stripes appropriately, maybe that 128 k stripe is all of one datafile. No guarantee,
of course – it depends on the file system whether the 128 k is all one datafile. Is it all
of one segment? Of one segment extent? One chance in a thousand, because Oracle distributes
the segments throughout the physical datafile.
Those are the inefficiencies that we've been living with for a long time. So now we move
over to the ASM structure which replaces this physical structure. The datafile becomes an
ASM file. No problem with that at all. ASM files live on disk groups – a many-to-one
relationship. Disk groups consist of many disks. We'll talk about disk groups and disks,
and what a disk actually is, shortly.
[pause]
Your disks are formatted into allocation units. You can draw an analogy with an OS block at
that point, but it is only an analogy. Then the ASM files are divided into file extents,
which consist of multiple allocation units.
[pause]
Why does this work so much better? First, the allocation units. The default is 1 megabyte –
though normally I'll be tuning that quite substantially in conjunction with you – and that
increases your I/O potential hugely.
In a normal file system, the unit of I/O – say one block, or even one stripe – is far smaller
than an allocation unit can be. A fairly standard change you see when you go to ASM
is that the unit of I/O increases dramatically. That's a far more efficient use of your hardware.
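Allocation unit size is set when the disk group is created – a minimal sketch, with a hypothetical disk group name and disk path; AU_SIZE must be a power of two between 1M and 64M and cannot be changed afterwards:

```sql
-- Hypothetical disk group with a 4 MB allocation unit instead of
-- the 1 MB default.
CREATE DISKGROUP data EXTERNAL REDUNDANCY
  DISK '/dev/oracleasm/disks/DISK1'
  ATTRIBUTE 'au_size' = '4M',
            'compatible.asm' = '11.2';
```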
But best of all is the file extents. If we configure ASM and tablespaces correctly, we
will be aligning the segment extents with the file extents.
[pause]
This means that when we read that 1 megabyte allocation unit, every single byte of that
1 megabyte I/O is absolutely guaranteed to be consecutive Oracle blocks of the same segment.
We are not wasting one byte of I/O. This is where the astronomical performance can come
in. We align the segment extents with the allocation units, the segment extent boundaries,
the allocation unit boundaries, and the file extent boundaries. Comparatively, you get
huge units of I/O. Absolutely every unit of I/O is the blocks we want.
This gives you huge advantages for your full scan operations – table full scan and index
fast full scan. That's why data warehouse environments tend to perform so well on ASM.
With OLTP you don't get the same benefits – they're smaller – but you're still getting much
more efficient use of the environment.
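One common way to pursue the alignment John describes – a sketch assuming the default 1 MB allocation units; the tablespace and disk group names are hypothetical – is to use locally managed tablespaces with uniform extents sized as a multiple of the AU size:

```sql
-- Uniform 4 MB extents: every segment extent covers whole
-- allocation units, so no I/O is wasted on full scans.
CREATE TABLESPACE dw_data
  DATAFILE '+DATA' SIZE 10G
  EXTENT MANAGEMENT LOCAL UNIFORM SIZE 4M;
```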
[pause]
>> Dave: Pardon, John. This seems to be such a big takeaway as part of this presentation.
Just to recap, in the old model we're suffering two problems. First of all, fragmentation.
[pause]
The physical extents were not necessarily contiguous on the disk.
[pause]
>> John: That's very true. I've never really understood how UNIX inodes work that way.
We know it's more efficient than say a Windows file allocation table. But that fragmentation
is going to be a problem.
[pause]
>> Dave: Another problem I think you said was that we might be reading 128 k in a single
I/O but only a fraction of that 128 k might belong to that particular segment.
[pause]
>> John: That's precisely it. That situation can be improved if you go to more modern extent
based file systems. But you need to make that change.
>> Dave: And ASM is solving both of those problems.
>> John: And is doing it automatically.
>> Dave: We got a question on the queue as well. What is the maximum size and number
of disk groups, disks or files?
[pause]
>> John: I'd have to go to the docs to be certain on this, but the scalability of this thing
is just ludicrous. If I remember correctly – and of this I'm fairly certain – in the current
release you are limited to 63 disk groups. That is in fact a limitation some people hit,
particularly in Exadata environments where you may be consolidating many databases. So,
63 disk groups. With release 12c that goes up to 511, so the problem is fixed.
[pause]
However, within one disk group you can have 10,000 disks, and the size of each disk is
something ludicrous – it's in the terabytes. So, up to 10,000 disks of effectively any
size, and up to 63 disk groups. If I remember correctly, one disk group can contain up to
a million files, and the maximum file size is in the petabytes. As for scalability, I never
even really bothered to memorize the figures – it's astronomical.
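Rather than memorizing the figures, you can query an ASM instance directly; these v$ views are standard:

```sql
-- Disk group sizes and allocation unit sizes
SELECT name, total_mb, free_mb, allocation_unit_size
FROM   v$asm_diskgroup;

-- Number and total size of disks per disk group
SELECT group_number, COUNT(*) AS disks, SUM(total_mb) AS total_mb
FROM   v$asm_disk
GROUP  BY group_number;
```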