.NET Explored

Visual Studio 2010 gets much needed Design and Architecture support

21 December 2009

For years, a .NET architect who needed to model a software system had to rely on modeling tools such as Rational XDE or Visio Enterprise Architect. I tried the modeling support and code generation that Visio Enterprise Architect offers, but I was not impressed. Although Visio has many stencils, templates, and symbols, UML modeling and the associated code generation were always a bit cumbersome, and keeping models in sync with code (and vice versa) was another challenge. Third-party tools like Rational XDE have good support for .NET, but one has to pay hefty license fees to use them.

As a result, my system modeling was confined to Microsoft Word documents, PowerPoint slides, and Visio diagrams. Keeping Word- and Visio-based models up to date with architecture, design, and code changes was always a game of catch-up, leaving the code, designs, and overall system documentation out of sync and hurting traceability between these artifacts.

Increasingly, this caused disharmony between architecture modeling and development teams.

After all these years, Microsoft Visual Studio 2010 (VS2010) finally seems to have helped overcome this obstacle by embracing the Unified Modeling Language (UML) and making architecture, design, development, and testing possible seamlessly within its integrated development environment.

Given below are some of the new design and architecture features:

  1. Architecture Explorer – discover and identify existing code assets and architecture in a number of ways, including graphs, stacked diagrams, and dependency matrices.
  2. Ability to create and share various types of diagrams, such as use case, activity, and sequence diagrams.
  3. Modeling tools that are tightly integrated with code and thus help keep model and code in sync.
  4. Architectural Validation – ways to put constraints on code using models and to validate them at check-in and build time.
  5. Architecture Layer Diagram – one of the simplest and most useful tools being introduced. It lets you represent your application architecture as layers and show the dependencies between them. It also lets you map physical components such as classes and namespaces to these layers. Once everything is mapped, you can validate whether the code meets the expected mappings and constraints.
  6. Microsoft joins the OMG, and UML is introduced in Visual Studio 2010.
  7. Support for UML 2.1.1 – 5 out of 13 diagrams: use case, component, activity, class, and sequence diagrams.
  8. Ability to keep all the UML diagrams in sync so that a change in one is automatically reflected in the others.
  9. Will be interoperable with Visio 1.1 templates.
  10. Supports a top-down design approach.
  11. Supports a bottom-up design approach (reverse engineering), with filtering based on namespaces and the number of levels deep.
  12. Model Explorer – similar to Solution Explorer, this lets you explore all the models you have created, including the objects created as part of various UML diagrams (a logical view).
  13. Ability to create a sequence diagram from existing source, simply by right-clicking in the VS code editor and selecting “Generate Sequence Diagram…”

 

However, this is not all there is to VS2010’s architecture modeling support. I will go into more detail as I explore more features.


Microsoft’s answer to Ruby on Rails – Dynamic Data

18 December 2009

ASP.NET Dynamic Data brings major usability and rapid application development (RAD) changes to the existing ASP.NET data controls. Development speed is significantly increased by the use of a rich scaffolding framework. After you add a LINQ to SQL or Entity Framework data model to a project, you can simply register it with Dynamic Data. The result is a fully functional Web site. Full CRUD (create, read, update, and delete) operations are supported. The site includes filtering by foreign keys and Boolean fields; foreign keys are automatically converted to their friendly names. Smart validation is automatically available, which provides validation based on database constraints for nullable fields, data type, and field length.

The DetailsView and GridView controls have been extended to display fields by using templates instead of by using hard-coded rules that are programmed in the controls. These templates are part of the project, and you can customize them to change their appearance or to specify which controls they use for rendering. This makes it very easy to make a change in one place in your site that specifies, for example, how to present dates for editing. FormView and ListView controls can implement similar behavior by using a DynamicControl object in their templates and by specifying which field in the row to display. Dynamic Data will then automatically build the UI for these controls based on the templates that you specify.

Validation is significantly improved in the controls as well. The controls read metadata for a LINQ to SQL or Entity Framework data model and provide automatic validation based on the model. For example, if a column in the database is limited to 50 characters, that limit is validated, and if a column is marked as not nullable, a RequiredFieldValidator control is automatically enabled for the column. (The controls also automatically support data-model-level validation.) You can apply other metadata to take further control over display and validation.

In a nutshell, this enables you to build data-driven Web sites very quickly that work against a LINQ to SQL (and in the future LINQ to Entities) object model, and it optionally allows you to do this without having to build any pages manually.


VB 6.0 Migration

10 December 2009

VB6 Migration Library

 

The Visual Basic 6.0 runtime will be supported on Windows Server 2003 until June 2008 for Mainstream Support and until June 2013 for Extended Support. The IDE will move out of Extended Support on April 8, 2008.

 

Below are a series of links that relate to VB6 and VB6 Migration to VB.NET.

ASP to ASP.NET Migration Assistant:

The ASP to ASP.NET Migration Assistant is designed to help you convert ASP pages and applications to ASP.NET. It does not make the conversion process completely automatic, but it will speed up your project by automating some of the steps required for migration.

In this guide, you will find:

  • Instructions on how to download and install the migration assistant and accompanying code sample.
  • An introduction to the ASP to ASP.NET Migration Assistant.
  • A comprehensive set of white papers on technical conversion issues.
  • Source code for a Web site before and after conversion.
  • Extensive guidance on how to best leverage ASP.NET, including new best practices material from the Prescriptive Architecture Guidance (PAG) group.

This guide contains three sections:

  1. Download and Install the ASP to ASP.NET Migration Assistant
    Visit http://www.asp.net/migrationassistants/asp2aspnet.aspx and follow the instructions to download and install the ASP to ASP.NET Migration Assistant and the accompanying code sample.
  2. Using the ASP to ASP.NET Migration Assistant
    Moving to ASP.NET is generally accomplished in two steps. First you port the existing ASP application to ASP.NET and then you can optimize the application to fully leverage the .NET Framework. This section covers the initial migration using the Migration Assistant to automate some of the steps.
  3. Optimizing Your Migrated Site For ASP.NET
    Sites migrated using the Migration Assistant can leverage the benefits of the .NET Framework, but there are still many optimizations that can be made to fully utilize the power of ASP.NET. This section describes how to optimize the ASP.NET site ported from ASP in the previous section.

 

MS WhitePaper:

Moving from Visual Basic to ASP.NET
With ASP.NET and Visual Studio .NET, creating Web applications and creating stand-alone Windows desktop applications are becoming nearly identical tasks. Explore how Visual Basic 6.0 developers can easily move their skills to the Web using ASP.NET.

Converting ASP to ASP.NET
An examination of a typical data-driven ASP application, and a discussion of the essential steps involved in porting that ASP application to ASP.NET.


.NET 1.1 to .NET 2.0 Migration FAQ

8 December 2009


Q: “How much work will it take to move from 1.1 to 2.0 and how long will it take?” 

 A: The answer is “It depends”.  Some applications move over with the simple click of a button, and others take a little more work than that.

 

Q: “Do you need to migrate at all?” 

 A: If it is an application that is still “alive” and moving forward, then the answer is “Yes, you should migrate,” so keep reading. If you have a 1.1 application that is working just fine and you don’t plan on modifying or updating it, leave it as it is; it should keep working just fine side by side.

 

Q: What happens when you load/run a .NET 1.1 application on a machine that has both 1.1 and 2.0 installed?

 A:

 

Application type | Computer with 1.1 | Computer with 2.0 | Computer with 1.1 and 2.0
1.1 stand-alone application (Web or Microsoft Windows client) | Loads with 1.1 | Loads with 2.0 | Loads with 1.1
2.0 stand-alone application (Web or Microsoft Windows client) | Fails | Loads with 2.0 | Loads with 2.0
1.1 add-in to a native application (such as Office or Internet Explorer) | Loads with 1.1 | Loads with 2.0 | Loads with 2.0 unless the process is configured to run against 1.1
2.0 add-in to a native application (such as Office or Internet Explorer) | Fails | Loads with 2.0 | Loads with 2.0

 

Given that .NET was designed for side-by-side usage since it first came out, that 1.1 application should run fine in a side-by-side environment. It’s the scenarios in which a 1.1 application or add-in loads with 2.0 that you really need to pay attention to when testing. What will catch folks off-guard are the scenarios in the third row of the table (a 1.1 add-in to a native application). The gotcha is that you might not even be aware of which applications cause the third-row scenarios.

 

The 2.0 framework is mostly backwards compatible with the 1.1 framework, meaning most 1.1 applications should run okay on the 2.0 framework. Therefore, the scenarios in which 1.1 code loads with 2.0 should work out for you most of the time if you choose not to migrate your 1.1 application forward. But (and there is always a ‘but’), there are some breaking changes in the 2.0 framework.

 

Here’s where you can find out about breaking changes in the 2.0 Framework.  Use this list ahead of time to analyze your 1.1 application and anticipate where issues might crop up:

http://msdn.microsoft.com/en-us/library/ms994364.aspx

 

Then, once you have a heads up on what could go wrong, test out the scenarios in the matrix where your 1.1 application could be loaded using the 2.0 framework. Here are some test scenarios to consider:

http://msdn.microsoft.com/en-us/library/ms994387.aspx

 

If your 1.1 application fails to run on 2.0 for whatever reason, and you still need to support the scenario in the last column of the third row, you can make some configuration changes to force the application to run using the 1.1 framework.  Check these two links for more information:

  1. http://msdn.microsoft.com/en-us/library/s80xxs7s(VS.71).aspx
  2. http://blogs.digineer.com/blogs/tabraham/archive/2005/12/09/15.aspx

  

Q: I want to migrate my application from .NET 1.1 to .NET 2.0.  Where do I start?

 

The quick and dirty answer:  Back up your VS 2003/1.1 solution first!!!  Then, open your Visual Studio 2002 or 2003 project/solution in Visual Studio 2005 or 2008.  A conversion wizard will convert the project/solution to 2005.  Compile the code, and you will now have a .NET 2.0 application!

 

The real answer:  If you’re migrating a Windows Forms application, most of the time the quick and dirty answer above will be all it takes!  If you’re migrating an ASP.NET web application, you should do a little homework before you open your .NET 1.1 project/solution in Visual Studio 2005 or 2008.

 

Let’s talk about non-web applications first.  As I mentioned, most of the time, opening the application in Visual Studio 2005 or 2008 and running it through the conversion wizard should be all it takes.  This will merely update the project files (.vbproj/.csproj) to work with 2005.  It will not update your application code to take advantage of the new .NET 2.0 features.  At this point, when you compile your existing 1.1 code, it will be compiled against the 2.0 framework.   If the code doesn’t compile, check out the list of breaking changes in the 2.0 framework mentioned above and begin troubleshooting from there.

 

For migrating web applications, there can be more work involved since the project model for web applications has changed greatly in Visual Studio 2005 and 2008.  Fortunately, migration information is centrally located on MSDN’s ASP.NET migration center.

 

Tech Articles:

 

Common ASP.NET 2.0 Conversion Issues and Solutions
Solve some of the common conversion issues developers may face when upgrading from ASP.NET 1.x to 2.0.

What’s New in Web Development for Visual Studio 2005
Visual Web Developer continues to bring you the productivity benefits of the Visual Web Developer integrated development environment (IDE) while introducing a wide array of improvements.

Migrating from ASP to ASP.NET 2.0
Learn the concepts and tools you will need to develop applications in ASP.NET 2.0 coming from classic ASP. Explore the benefits of Microsoft’s newest Web technology and understand how to transition from ASP to ASP.NET 2.0.

ASP.NET 2.0 QuickStart Tutorial
The ASP.NET QuickStart is a series of ASP.NET samples and supporting commentary designed to quickly acquaint developers with the syntax, architecture, and power of the ASP.NET Web programming framework.


Top 10 SQL Server Integration Services Best Practices

5 December 2009

How many of you have heard the myth that Microsoft® SQL Server® Integration Services (SSIS) does not scale? The first question we would ask in return is: “Does your system need to scale beyond 4.5 million sales transaction rows per second?” SQL Server Integration Services is a high performance Extract-Transform-Load (ETL) platform that scales to the most extreme environments. And as documented in SSIS ETL world record performance, SQL Server Integration Services can process at the scale of 4.5 million sales transaction rows per second.

 
  

1. SSIS is an in-memory pipeline, so ensure that all transformations occur in memory.

The purpose of having Integration Services within SQL Server features is to provide a flexible, robust pipeline that can efficiently perform row-by-row calculations and parse data all in memory.

While the extract and load phases of the pipeline will touch disk (read and write respectively), the transformation itself should process in memory. If transformations spill to disk (for example with large sort operations), you will see a big performance degradation. Construct your packages to partition and filter data so that all transformations fit in memory.

A great way to check if your packages are staying within memory is to review the SSIS performance counter Buffers spooled, which has an initial value of 0; above 0 is an indication that the engine has started swapping to disk. For more information, please refer to Something about SSIS Performance Counters.

 

2. Plan for capacity by understanding resource utilization.

SQL Server Integration Services is designed to process large amounts of data row by row in memory with high speed. Because of this, it is important to understand resource utilization, i.e., the CPU, memory, I/O, and network utilization of your packages.

CPU Bound
Seek to understand how much CPU is being used by Integration Services and how much CPU is being used overall by SQL Server while Integration Services is running. This latter point is especially important if you have SQL Server and SSIS on the same box, because if there is a resource contention between these two, it is SQL Server that will typically win – resulting in disk spilling from Integration Services, which slows transformation speed.

The perfmon counter that is of primary interest to you is Process / % Processor Time (Total). Measure this counter for both sqlservr.exe and dtexec.exe. If SSIS is not able to drive close to 100% CPU load, this may be indicative of:

  • Application contention: For example, SQL Server is taking on more processor resources, making them unavailable to SSIS.
  • Hardware contention: A common scenario is that you have suboptimal disk I/O or not enough memory to handle the amount of data being processed.
  • Design limitation: The design of your SSIS package is not making use of parallelism, and/or the package uses too many single-threaded tasks.

Network Bound
SSIS moves data as fast as your network is able to handle it. Because of this, it is important to understand your network topology and ensure that the path between your source and target has both low latency and high throughput.

The following Network perfmon counters can help you tune your topology:

  • Network Interface / Current Bandwidth: This counter provides an estimate of current bandwidth.
  • Network Interface / Bytes Total / sec: The rate at which bytes are sent and received over each network adapter.
  • Network Interface / Transfers/sec: Tells how many network transfers per second are occurring. If it is approaching 40,000 IOPs, then get another NIC card and use teaming between the NIC cards.

These counters enable you to analyze how close you are to the maximum bandwidth of the system. Understanding this will allow you to plan capacity appropriately whether by using gigabit network adapters, increasing the number of NIC cards per server, or creating separate network addresses specifically for ETL traffic.

I/O Bound
If you ensure that Integration Services is minimally writing to disk, SSIS will only hit the disk when it reads from the source and writes to the target. But if your I/O is slow, reading and especially writing can create a bottleneck.

Because tuning I/O is outside the scope of this technical note, please refer to Predeployment I/O Best Practices. Remember that an I/O system is specified not only by its size (“I need 10 TB”) but also by its sustainable speed (“I want 20,000 IOPs”).

Memory Bound
A very important question that you need to answer when using Integration Services is: “How much memory does my package use?”

The key counters for Integration Services and SQL Server are:

  • Process / Private Bytes (DTEXEC.exe) – The amount of memory currently in use by Integration Services. This memory cannot be shared with other processes.
  • Process / Working Set (DTEXEC.exe) – The total amount of allocated memory by Integration Services.
  • SQL Server: Memory Manager / Total Server Memory: The total amount of memory allocated by SQL Server. Because SQL Server has another way to allocate memory using the AWE API, this counter is the best indicator of total memory used by SQL Server. To understand SQL Server memory allocations better, refer to Slava Ok’s Weblog.
  • Memory / Page Reads / sec – Represents the total memory pressure on the system. If this consistently goes above 500, the system is under memory pressure.

 

3. Baseline source system extract speed.

Understand your source system and how fast you extract from it. After all, Integration Services cannot be tuned beyond the speed of your source – i.e., you cannot transform data faster than you can read it.
Measure the speed of the source system by creating a very simple package that reads data from your source with a destination of “Row Count”:

Execute the package from the command line (DTEXEC) and measure the time it took for it to complete its task. Use the Integration Services log output to get an accurate calculation of the time. You want to calculate rows per second:

Rows / sec = Row Count / Time (Data Flow)
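
As a rough illustration of the calculation above, here is a minimal Python sketch. The function name, the timestamps, and the row count are all made-up assumptions; in practice the start and end times of the data flow would come from the Integration Services log output.

```python
from datetime import datetime


def rows_per_second(row_count: int, start: datetime, end: datetime) -> float:
    """Return extract throughput: rows read divided by data flow seconds."""
    elapsed = (end - start).total_seconds()
    if elapsed <= 0:
        raise ValueError("end must be later than start")
    return row_count / elapsed


# Hypothetical figures: 12,000,000 rows read in 4 minutes of data flow time.
start = datetime(2009, 12, 5, 1, 0, 0)
end = datetime(2009, 12, 5, 1, 4, 0)
baseline = rows_per_second(12_000_000, start, end)
print(f"{baseline:,.0f} rows/sec")  # 50,000 rows/sec
```

Re-running the same measurement after each change to drivers, connections, or network setup gives a like-for-like comparison against this baseline.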

Based on this value, you now know the maximum number of rows per second you can read from the source – this is also the ceiling on how fast you can transform your data. To increase this Rows / sec calculation, you can do the following:

  • Improve drivers and driver configurations: Make sure you are using the most up-to-date driver configurations for your network, data source, and disk I/O. Often the default network drivers on your server are not configured optimally for the network stack, which results in performance degradations when there are a high number of throughput requests. Note that for 64-bit systems, at design time you may be loading 32-bit drivers; ensure that at run time you are using 64-bit drivers.
     
  • Start multiple connections: To overcome limitations of drivers, you can try to start multiple connections to your data source. As long as the source can handle many concurrent connections, you may see an increase in throughput if you start several extracts at once. If concurrency is causing locking or blocking issues, consider partitioning the source and having your packages read from different partitions to more evenly distribute the load.
     
  • Use multiple NIC cards: If the network is your bottleneck and you’ve already ensured that you’re using gigabit network cards and routers, then a potential solution is to use multiple NIC cards per server. Note that you will have to be careful when you configure multiple NIC environments; otherwise you will have network conflicts.

 

4. Optimize the SQL data source, lookup transformations, and destination.

When you execute SQL statements within Integration Services (as noted in the above Data access mode dialog box), whether to read a source, to perform a lookup transformation, or to change tables, some standard optimizations significantly help performance:

  • Use the NOLOCK or TABLOCK hints to remove locking overhead.
  • To optimize memory usage, SELECT only the columns you actually need. If you SELECT all columns from a table (e.g., SELECT * FROM) you will needlessly use memory and bandwidth to store and retrieve columns that do not get used.
  • If possible, perform your datetime conversions at your source or target databases, as it is more expensive to perform them within Integration Services.
  • SQL Server 2008 Integration Services introduces a shared lookup cache. When using parallel pipelines (see points #8 and #10 below), it provides a high-speed, shared cache.
  • If Integration Services and SQL Server run on the same server, use the SQL Server destination instead of the OLE DB destination to improve performance.
  • Commit size 0 is fastest on heap bulk targets, because only one transaction is committed. If you cannot use 0, use the highest possible value of commit size to reduce the overhead of multiple-batch writing. Commit size = 0 is a bad idea if inserting into a Btree, because all incoming rows must be sorted at once into the target Btree, and if your memory is limited, you are likely to spill. Batchsize = 0 is ideal for inserting into a heap. For an indexed destination, I recommend testing between 100,000 and 1,000,000 as batch size.
  • Use a commit size of <5000 to avoid lock escalation when inserting; note that in SQL Server 2008 you can now enable/disable lock escalation at the object level, but use this wisely.
  • Heap inserts are typically faster than using a clustered index. This means that you may want to drop indexes and rebuild if you are changing a large part of the destination table; you will want to test your inserts both by keeping indexes in place and by dropping all indexes and rebuilding to validate.
  • Use partitions and the partition SWITCH command; i.e., load a work table that contains a single partition and SWITCH it in to the main table after you build the indexes and put the constraints on.
     
  • Another great reference from the SQL Performance team is Getting Optimal Performance with Integration Services Lookups.

 

5. Tune your network.

A key network property is the packet size of your connection. By default this value is set to 4,096 bytes. This means a new network packet must be assembled for every 4 KB of data. As noted in SqlConnection.PacketSize Property in the .NET Framework Class Library, increasing the packet size will improve performance because fewer network read and write operations are required to transfer a large data set.
If your system is transactional in nature, with many small data size read/writes, lowering the value will improve performance.

Since Integration Services is all about moving large amounts of data, you want to minimize the network overhead. This means that the value 32K (32767) is the fastest option. While it is possible to configure the network packet size on a server level using sp_configure, you should not do this. The database administrator may have reasons to use a different server setting than 32K. Instead, override the server settings in the connection manager as illustrated below.
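
To get a feel for why the larger packet size helps bulk transfers, here is a small Python sketch that counts how many packets a transfer needs at each setting. It is a simplification under a stated assumption: every packet is treated as completely filled with data and protocol header overhead is ignored, so real counts will be somewhat higher. The function name and the 1 GiB transfer size are illustrative only.

```python
def packets_needed(total_bytes: int, packet_size: int) -> int:
    """Estimate packets required, assuming each packet is filled completely
    and ignoring per-packet header overhead (a simplifying assumption)."""
    if packet_size <= 0:
        raise ValueError("packet_size must be positive")
    return -(-total_bytes // packet_size)  # ceiling division on integers


ONE_GIB = 1024 ** 3  # illustrative transfer size

default_packets = packets_needed(ONE_GIB, 4096)
large_packets = packets_needed(ONE_GIB, 32767)

print(default_packets)  # 262144
print(large_packets)    # 32770
print(f"about {default_packets / large_packets:.1f}x fewer packets")
```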

Another network tuning technique is to use network affinity at the operating system level. At high throughputs, you can sometimes improve performance this way.

For the network itself, you may want to work with your network specialists to enable jumbo frames to increase the default payload of 1,500 bytes to 9,000 bytes. By enabling jumbo frames, you will further decrease the number of network operations required to move large data sets.

 

6. Use data types – yes, back to data types! – wisely.

Of all the points on this top 10 list, this is perhaps the most obvious. Yet, it is such an important point that it needs to be made separately. Follow these guidelines:

  • Make data types as narrow as possible so you will allocate less memory for your transformation.
  • Do not perform excessive casting of data types – it will only degrade performance. Match your data types to the source or destination and explicitly specify the necessary data type casting.
  • Watch precision issues when using the money, float, and decimal types. Also, be aware that money is faster than decimal, and money has fewer precision considerations than float.

 

7. Change the design.

There are some things that Integration Services does well – and other tasks where using another tool is more efficient. Your tool choice should be based on what is most efficient and on a true understanding of the problem. To help with that choice, consider the following points:

  •  Do not sort within Integration Services unless it is absolutely necessary. In order to perform a sort, Integration Services allocates the memory space of the entire data set that needs to be transformed. If possible, presort the data before it goes into the pipeline. If you must sort data, try your best to sort only small data sets in the pipeline. Instead of using Integration Services for sorting, use an SQL statement with ORDER BY to sort large data sets in the database – mark the output as sorted by changing the Integration Services pipeline metadata on the data source.
     
  • There are times where using Transact-SQL will be faster than processing the data in SSIS. As a general rule, any and all set-based operations will perform faster in Transact-SQL because the problem can be transformed into a relational (domain and tuple) algebra formulation that SQL Server is optimized to resolve. Also, the SQL Server optimizer will automatically apply high parallelism and memory management to the set-based operation – an operation you may have to perform yourself if you are using Integration Services. Typical set-based operations include:
     

    • Set-based UPDATE statements – which are far more efficient than row-by-row OLE DB calls.
       
    • Aggregation calculations such as GROUP BY and SUM. These are typically also calculated faster using Transact-SQL instead of in-memory calculations by a pipeline.
       
  • Delta detection is the technique where you change existing rows in the target table instead of reloading the table. To perform delta detection, you can use a change detection mechanism such as the new SQL Server 2008 Change Data Capture (CDC) functionality. If such functionality is not available, you need to do the delta detection by comparing the source input with the target table. This can be a very costly operation requiring the maintenance of special indexes and checksums just for this purpose. Often, it is fastest to just reload the target table. A rule of thumb is that if the target table has changed by >10%, it is often faster to simply reload than to perform the logic of delta detection. 
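
The reload-versus-delta rule of thumb above can be expressed as a tiny Python helper. This is only a sketch of the stated 10% guideline; the function name and the sample row counts are hypothetical, and the threshold is worth validating against timings on your own system.

```python
def should_reload(changed_rows: int, total_rows: int, threshold: float = 0.10) -> bool:
    """Apply the rule of thumb: reload when more than `threshold` of the
    target table has changed, otherwise perform delta detection."""
    if total_rows <= 0:
        return True  # empty target: nothing to compare against, just load
    return changed_rows / total_rows > threshold


# Hypothetical nightly loads against a 50,000,000-row target table.
print(should_reload(2_000_000, 50_000_000))  # False: 4% changed, use delta detection
print(should_reload(9_000_000, 50_000_000))  # True: 18% changed, reload the table
```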

 

8. Partition the problem.

One of the main tenets of scalable computing is to partition problems into smaller, more manageable chunks. This allows you to more easily handle the size of the problem and make use of running parallel processes in order to solve the problem faster.

For ETL designs, you will want to partition your source data into smaller chunks of equal size. This latter point is important because if you have chunks of different sizes, you will end up waiting for one process to complete its task. For example, looking at the graph below, you will notice that for the four processes executed on partitions of equal size, the four processes will finish processing January 2008 at the same time and then together continue to process February 2008. But for the partitions of different sizes, the first three processes will finish processing but wait for the fourth process, which is taking a much longer time. The total run time will be dominated by the largest chunk.
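
The effect of uneven chunk sizes can be shown with a short Python sketch. It assumes, purely for illustration, four parallel processes that each handle one chunk at the same constant throughput; the row counts and the rate are invented numbers.

```python
def total_run_time(chunk_rows, rows_per_second):
    """With one parallel process per chunk, the slowest chunk sets the run time."""
    return max(rows / rows_per_second for rows in chunk_rows)


RATE = 50_000  # hypothetical rows per second per process

equal_chunks = [25_000_000] * 4
skewed_chunks = [10_000_000, 10_000_000, 10_000_000, 70_000_000]

print(total_run_time(equal_chunks, RATE))   # 500.0 seconds
print(total_run_time(skewed_chunks, RATE))  # 1400.0 seconds
```

Both layouts hold the same 100,000,000 rows, yet the skewed layout takes almost three times as long because three processes sit idle while the fourth finishes.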

To create ranges of equal-sized partitions, use time period and/or dimensions (such as geography) as your mechanism to partition. If your primary key is an incremental value such as an IDENTITY or another increasing value, you can use a modulo function. If you do not have any good partition columns, create a hash of the value of the rows and partition based on the hash value. For more information on hashing and partitioning, refer to the Analysis Services Distinct Count Optimization white paper; while the paper is about distinct count within Analysis Services, the technique of hash partitioning is treated in depth too.
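
Here is a minimal Python sketch of the modulo and hash approaches described above. The function names and sample rows are hypothetical, and SHA-256 is just one convenient stable hash chosen for the illustration; inside the database you would normally use whatever hashing function your platform provides.

```python
import hashlib
from collections import Counter


def partition_by_modulo(key: int, partitions: int) -> int:
    """Assign a row to a partition from an incrementing key such as an IDENTITY."""
    return key % partitions


def partition_by_hash(row: tuple, partitions: int) -> int:
    """Assign a row to a partition from a stable hash of its values."""
    digest = hashlib.sha256(repr(row).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


ids = range(1, 1001)

# An incrementing key spreads perfectly evenly across four partitions.
modulo_counts = Counter(partition_by_modulo(i, 4) for i in ids)
print(sorted(modulo_counts.items()))  # [(0, 250), (1, 250), (2, 250), (3, 250)]

# Rows without a good partition column: hash the whole row instead.
rows = [("store-%d" % (i % 37), "2008-01-%02d" % (i % 28 + 1), i) for i in ids]
hash_counts = Counter(partition_by_hash(row, 4) for row in rows)
print(sorted(hash_counts.items()))  # roughly even, not exactly equal
```

A stable hash is used rather than Python's built-in `hash()`, because the built-in is randomized per process for strings and would send the same row to different partitions on different runs.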

Some other partitioning tips:

  • Use partitioning on your target table. This way you will be able to run multiple versions of the same package, in parallel, that insert data into different partitions of the same table. When using partitioning, the SWITCH statement is your friend. It not only increases parallel load speeds, but also allows you to efficiently transfer data. Please refer to the SQL Server Books Online article Transferring Data Efficiently by Using Partition Switching for more information.
     
  • As implied above, you should design your package to take a parameter specifying which partition it should work on. This way, you can have multiple executions of the same package, all with different parameter and partition values, so you can take advantage of parallelism to complete the task faster.
     
  • From the command line, you can run multiple executions by using the “START” command. A quick code example of running multiple robocopy statements in parallel can be found within the Sample Robocopy Script to custom synchronize Analysis Services databases technical note.

 

9. Minimize logged operations.

When you insert data into your target SQL Server database, use minimally logged operations if possible. When data is inserted into the database in fully logged mode, the log will grow quickly because each row entering the table also goes into the log.

Therefore, when designing Integration Services packages, consider the following:

  • Try to perform your data flows in bulk mode instead of row by row. By doing this in bulk mode, you will minimize the number of entries that are added to the log file. This reduction will improve the underlying disk I/O for other inserts and will minimize the bottleneck created by writing to the log.
     
  • If you need to perform delete operations, organize your data in a way so that you can TRUNCATE the table instead of running a DELETE. The latter will place an entry for each row deleted into the log. But the former will simply remove all of the data in the table with a small log entry representing the fact that the TRUNCATE occurred. In contrast with popular belief, a TRUNCATE statement can participate in a transaction.
     
  • Use the SWITCH statement and partitioning. If partitions need to be moved around, you can use the SWITCH statement (to switch in a new partition or switch out the oldest partition), which is a minimally logged statement.
     
  • Be careful when using DML statements; if you mix in DML statements within your INSERT statements, minimum logging is suppressed. 

10. Schedule and distribute it correctly.

After your problem has been chunked into manageable sizes, you must consider where and when these chunks should be executed. The goal is to avoid one long running task dominating the total time of the ETL flow.

A good way to handle execution is to create a priority queue for your package and then execute multiple instances of the same package (with different partition parameter values). The queue can simply be a SQL Server table. Each package should include a simple loop in the control flow:

  1.  Pick a relevant chunk from the queue:
    1. “Relevant” means that it has not already been processed and that all chunks it depends on have already run.
    2. If no item is returned from the queue, exit the package.
  2. Perform the work required on the chunk.
  3. Mark the chunk as “done” in the queue.
  4. Return to the start of loop.

Picking an item from the queue and marking it as “done” (steps 1 and 3 above) can be implemented as stored procedures, for example.

The queue acts as a central control and coordination mechanism, determining the order of execution and ensuring that no two packages work on the same chunk of data. Once you have the queue in place, you can simply start multiple copies of DTEXEC to increase parallelism.
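The loop described above can be sketched in C# with an in-memory stand-in for the queue. In a real deployment the queue would be a SQL Server table and the pick and mark operations would be stored procedures; every type and member name below is hypothetical and exists only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum ChunkState { Pending, Running, Done }

public class Chunk
{
    public string Name;
    public int Priority;                              // lower value is picked first
    public ChunkState State = ChunkState.Pending;
    public List<string> DependsOn = new List<string>();
}

public class ChunkQueue
{
    private readonly object gate = new object();
    private readonly List<Chunk> chunks;

    public ChunkQueue(IEnumerable<Chunk> items) { chunks = items.ToList(); }

    // Step 1: pick the highest-priority chunk that is still pending and whose
    // dependencies are all done. Returns null when nothing is runnable.
    public Chunk TryPick()
    {
        lock (gate)
        {
            Chunk next = chunks
                .Where(c => c.State == ChunkState.Pending)
                .Where(c => c.DependsOn.All(
                    d => chunks.Any(x => x.Name == d && x.State == ChunkState.Done)))
                .OrderBy(c => c.Priority)
                .FirstOrDefault();
            if (next != null) next.State = ChunkState.Running;  // claim it for this worker
            return next;
        }
    }

    // Step 3: mark the chunk as done.
    public void MarkDone(Chunk chunk)
    {
        lock (gate) { chunk.State = ChunkState.Done; }
    }
}

public static class Worker
{
    // The loop each package instance would run: pick, work, mark done, repeat.
    public static List<string> Run(ChunkQueue queue, Action<Chunk> work)
    {
        var processed = new List<string>();
        Chunk chunk;
        while ((chunk = queue.TryPick()) != null)   // exit when no item is returned
        {
            work(chunk);                             // step 2
            queue.MarkDone(chunk);                   // step 3
            processed.Add(chunk.Name);
        }
        return processed;
    }
}
```

Claiming the chunk inside the same lock that selects it is what keeps two workers from picking the same chunk; in a table-based queue the equivalent would be doing the select and the status update in one atomic statement or transaction.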


.NET Reactive(Rx) Framework

17 November 2009 | 1 Comment »

Reactive programming became a new buzzword in the Microsoft world late this summer, when Microsoft released the “Reactive Framework” (System.Reactive.dll) as part of the latest version of the Silverlight Toolkit. A complete version is expected to be part of Visual Studio 2010 and to be supported by .NET Framework 4.

But before delving into the Reactive Framework, let’s see what reactive programming means. Basically, it is about how a producer can publish data, both static and dynamic, and how a consumer can see that data, and any changes to it, seamlessly and automatically.

For example, in an imperative programming setting, a := b + c would mean that a is assigned the result of b + c at the instant the expression is evaluated.

Later, the values of b and c can be changed with no effect on the value of a. In reactive programming, the value of a would be automatically updated based on the new values.

A modern spreadsheet program is an example of reactive programming. Spreadsheet cells can contain literal values, or formulas such as “=B1+C1″ that are evaluated based on other cells. Whenever the value of the other cells change, the value of the formula is automatically updated.
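The difference can be made concrete with a small C# sketch. It does not use the Reactive Framework at all; Cell and SumCell are hypothetical classes written only to mimic a spreadsheet formula with plain .NET events.

```csharp
using System;

// A minimal observable value: raises Changed whenever Value is set.
public class Cell
{
    private int current;
    public event Action Changed;

    public Cell(int initial) { current = initial; }

    public int Value
    {
        get { return current; }
        set
        {
            current = value;
            Action handler = Changed;
            if (handler != null) handler();
        }
    }
}

// A derived value, like the spreadsheet formula "=B1+C1".
public class SumCell
{
    public int Value { get; private set; }

    public SumCell(Cell left, Cell right)
    {
        Action recompute = () => Value = left.Value + right.Value;
        left.Changed += recompute;    // re-evaluate whenever either input changes
        right.Changed += recompute;
        recompute();                  // initial evaluation
    }
}

public static class ReactiveDemo
{
    public static int[] Run()
    {
        var b = new Cell(1);
        var c = new Cell(2);

        int imperativeA = b.Value + c.Value;  // evaluated once, here
        var reactiveA = new SumCell(b, c);    // kept up to date from now on

        b.Value = 10;                         // change an input afterwards

        // imperativeA keeps its old value; reactiveA.Value follows the change.
        return new[] { imperativeA, reactiveA.Value };
    }
}
```

Run returns 3 for the imperative variable and 12 for the reactive cell: the first was computed once from b = 1 and c = 2, while the second was recomputed when b became 10.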

Similarly, another familiar concept in the Microsoft programming world is the event handler: a constantly running service publishes data, and we write a handler for its event to display that data.

Coming back to the “Reactive Framework” or “Rx Framework”: it is based on the same principle discussed above. With the rise of asynchronous programming, the Reactive Framework definitely comes in handy.

 

The Rx team has implemented both modes of communication, pull and push (publisher-subscriber), using the following interfaces.

  • The Iterator pattern (IEnumerable/IEnumerator), used by the listener to pull data from the source
  • The Observer pattern (IObservable/IObserver), the dual of the Iterator pattern, used by the source to push data to its subscribers
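To make the two interface pairs concrete, here is a small sketch that uses only the IObservable<T> and IObserver<T> interfaces that ship in the System namespace from .NET Framework 4 onwards, without any Rx operators. NumberSource, CollectingObserver and PushPullDemo are hypothetical names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Push side: a hand-written source that sends each value to its subscriber.
public class NumberSource : IObservable<int>
{
    private readonly int[] values;
    public NumberSource(params int[] values) { this.values = values; }

    public IDisposable Subscribe(IObserver<int> observer)
    {
        foreach (int v in values) observer.OnNext(v);   // the source drives the flow
        observer.OnCompleted();
        return new NoOpSubscription();
    }

    private sealed class NoOpSubscription : IDisposable
    {
        public void Dispose() { }
    }
}

// An observer that simply records what it is given.
public class CollectingObserver : IObserver<int>
{
    public readonly List<int> Received = new List<int>();
    public bool Completed;

    public void OnNext(int value) { Received.Add(value); }
    public void OnError(Exception error) { throw error; }
    public void OnCompleted() { Completed = true; }
}

public static class PushPullDemo
{
    // Pull: the consumer asks for each item via MoveNext and Current.
    public static List<int> Pull(IEnumerable<int> source)
    {
        var result = new List<int>();
        using (IEnumerator<int> e = source.GetEnumerator())
        {
            while (e.MoveNext()) result.Add(e.Current);
        }
        return result;
    }

    // Push: the consumer registers an observer and waits to be called.
    public static List<int> Push(IObservable<int> source)
    {
        var observer = new CollectingObserver();
        IDisposable subscription = source.Subscribe(observer);
        subscription.Dispose();
        return observer.Received;
    }
}
```

Both methods end up with the same list; what differs is who is in control. In Pull the consumer calls MoveNext, while in Push the source calls OnNext, which is why the second model suits events and asynchronous results.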

 

To delve deeper into the Reactive Framework, have a look at the video from Erik Meijer explaining this concept.

 

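Let’s take a quick look at how we can use Rx to simplify our code. We will write an extension method that calls a web client’s DownloadStringAsync method and returns the downloaded string as an observable.

// Requires: using System; using System.Linq; using System.Net;
// (plus a reference to the Rx preview assembly, System.Reactive.dll)
public static class WebClientExtender
{
    public static IObservable<string> GetSite(this WebClient client, string url)
    {
        // Wrap the DownloadStringCompleted event as an observable sequence.
        // Observable.FromEvent(target, eventName) is the 2009 preview API.
        var downloaded = Observable.FromEvent<DownloadStringCompletedEventArgs>
                         (client, "DownloadStringCompleted");
        // Start the asynchronous download; its completion raises the event.
        client.DownloadStringAsync(new Uri(url));
        // Project each event to the downloaded text.
        return downloaded.Select(x => x.EventArgs.Result);
    }
}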

To call this, we can now simply add the following line of code to our Silverlight application.

new WebClient().GetSite("http://localhost/").Subscribe(x => MessageBox.Show(x));

This way we can easily clean up the code and structure how asynchronous events are handled.

 

To explore some of its programming concepts, have a look at the blog post introducing Reactive (Rx) LINQ to Events.


OAuth and .NET

13 November 2009 | 2 Comments »

OAuth is the new wave in website security protocols or, to be precise, an API access delegation protocol. OAuth allows a client application to obtain a user’s consent (in the form of access tokens) to execute operations on that user’s private resources on his or her behalf.

OAuth allows you to share your private resources (photos, videos, contact lists, bank accounts) stored on one site with another site without having to hand out your username and password. There are many reasons why you should not share your private credentials. Giving your email account password to a social network site so it can look up your friends is the same as going to dinner and giving your ATM card and PIN code to the waiter when it’s time to pay. Any restaurant asking for your PIN code would go out of business, but on the web, users put themselves at risk by sharing exactly this kind of private information. This is where OAuth comes to the rescue.

If you want to know more about how OAuth works, you should read the following posts

Now, if we analyze the specification in more detail, we will see that the real purpose behind OAuth is to create a network of collaboration between applications. It will no longer be necessary to keep all our stuff in a single place: we can, for instance, have our pictures on one website, our contacts in another place, and a third application making use of both, with all these applications collaborating.

Currently OAuth is mostly associated with social networking and web sites such as Twitter, Yahoo, and Google. However, this is going to change in the future: I see it being implemented in cloud computing environments to provide more seamless access. Google has released its OpenID/OAuth implementation, which is a major step forward in the field of interoperability. This work is very important: it will allow, for instance, a Zoho Writer user to use data from a Google Docs spreadsheet and then make the result available in his or her LinkedIn profile.

Similarly, with Microsoft releasing its new cloud computing platform, Azure, I think OAuth will become more important than ever before.

Some of the OAuth libraries available for the .NET Framework are:

Here are some OAuth implementation examples in .NET:


ORM in .NET

13 November 2009 | No Comments »

 

Introduction

ORM, or object-relational mapping, is one of the tougher things to accomplish in modern, object-oriented programming languages. It involves moving away from the traditional data store paradigm: there is no (or very little) dedicated, pre-compiled code involved in reading/writing an object to/from the database or other backing store. Instead, the logic involved in accessing the backing store is determined at runtime using a combination of reflection and attributes that decorate the business objects in question. Many projects and frameworks have been created to try to address this concept, with varying degrees of success. This article gives a general introduction to ORM concepts and to the approach that .NET 3.5 takes.

In the beginning…

Prior to .NET 3.5, you had several choices when it came to getting your business objects to and from the database:

  1. Roll your own – This means you don’t use any frameworks and don’t auto-generate any code. The database schema and the .NET classes are created by hand, as is the data access layer. While this will provide the ultimate level of customizability and performance, it’s tedious (involves copying a lot of boilerplate code), error prone, and difficult to maintain when the objects or the database schemas change.
  2. Auto-generate the classes and the data access layer – This is where code generation tools like CodeSmith or MyGeneration come in: you point them at your database and it will generate the .NET classes and the data access layer. Like option 1, this isn’t true ORM: you still have pre-compiled code responsible for accessing the database to read or write an object’s data. However, its automatic generation of the code is a step in the right direction, removing the error-prone human factor when creating the classes and the data access layer.
  3. Use a true ORM framework – There are several well-known ORM packages available for previous versions of the .NET framework, including NHibernate and Gentle.NET. As mentioned previously, ORM removes the dedicated data store code and inspects an object at runtime to determine what it needs to do to read/write it to/from the database. Attributes are used to decorate the class and its properties to give the framework pointers about where things go in the database. The actual SQL for an operation is generated dynamically based on these attributes. There is often a code-generation component in these packages that generates the .NET business object classes from the database schema, but no dedicated data access code is generated.
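To give a feel for the attribute-and-reflection mechanism described in option 3, here is a deliberately tiny sketch. TableAttribute, ColumnAttribute, Customer and TinyMapper are hypothetical types invented for this example; they are not the API of NHibernate, Gentle.NET or any other real framework, and a real ORM does far more (type conversion, caching, relationships).

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical mapping attributes, defined here only for illustration.
[AttributeUsage(AttributeTargets.Class)]
public sealed class TableAttribute : Attribute
{
    public string Name { get; private set; }
    public TableAttribute(string name) { Name = name; }
}

[AttributeUsage(AttributeTargets.Property)]
public sealed class ColumnAttribute : Attribute
{
    public string Name { get; private set; }
    public ColumnAttribute(string name) { Name = name; }
}

// A business object decorated with mapping hints.
[Table("Customers")]
public class Customer
{
    [Column("customer_id")] public int Id { get; set; }
    [Column("full_name")] public string Name { get; set; }
    public string NotMapped { get; set; }   // no attribute, so the mapper ignores it
}

public static class TinyMapper
{
    // Builds a parameterized INSERT statement by inspecting the type at runtime.
    public static string BuildInsert(Type type)
    {
        var table = (TableAttribute)Attribute.GetCustomAttribute(type, typeof(TableAttribute));
        if (table == null) throw new InvalidOperationException("Type has no [Table] attribute.");

        var columns = new List<string>();
        var parameters = new List<string>();
        foreach (PropertyInfo property in type.GetProperties())
        {
            var column = (ColumnAttribute)Attribute.GetCustomAttribute(property, typeof(ColumnAttribute));
            if (column == null) continue;           // skip unmapped properties
            columns.Add(column.Name);
            parameters.Add("@" + property.Name);
        }

        return "INSERT INTO " + table.Name +
               " (" + string.Join(", ", columns) + ") VALUES (" +
               string.Join(", ", parameters) + ")";
    }
}
```

Calling TinyMapper.BuildInsert(typeof(Customer)) yields a statement that inserts into Customers with the customer_id and full_name columns and the @Id and @Name parameters. No data access code was written for Customer itself, which is the point of the approach; the reflection work on every call also hints at where the performance cost comes from.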

Some Problems with ORM

So all this dynamic, runtime SQL generation stuff sounds great, right? Not so fast: ORM has several serious drawbacks. The first of these is performance, as you’re going to encounter a slowdown any time you bring reflection into the equation and start dynamically generating SQL. ORM will never be as fast as rolling your own: there’s no substitute for being able to hand-tweak your stored procedures and pre-compile all of the data access logic. Another drawback is that ORM doesn’t deal well with extremely complex databases. When designing complex databases with a lot of constraints and relationships spanning several tables, it’s often necessary to include intermediary tables to link various entities together. That is great from an RDBMS standpoint, but it doesn’t translate all that well to an object-oriented environment, and it can lead to obtuse and difficult-to-understand auto-generated classes. Keep in mind that RDBMS and object-oriented environments are fundamentally different, and each includes its own set of design and performance considerations. What works in one environment is not necessarily optimal for the other. That being said, the upside to ORM in terms of maintainable, clean, and easy-to-understand code can be quite compelling, provided that it’s used correctly.

ADO.NET Entity Framework in .NET 3.5

So, now that you have a good idea of what ORM is all about and its potential pitfalls, let’s delve into how Microsoft approached this concept in .NET 3.5. It takes a different approach to the challenge of ORM by not focusing on slaving the object model to a relational model, but by instead giving us an entirely new way to access and query our data that’s not limited only to relational data. With this approach, the ORM capabilities of .NET 3.5 evolve almost as a side-effect instead of being the prime focus of this new data access scheme. 
 

The ADO.NET Entity Framework is designed to enable developers to create data access applications by programming against a conceptual application model instead of programming directly against a relational storage schema. The goal is to decrease the amount of code and maintenance required for data-oriented applications. Entity Framework applications provide the following benefits:

  • Applications can work in terms of a more application-centric conceptual model, including types with inheritance, complex members, and relationships.
  • Applications are freed from hard-coded dependencies on a particular data engine or storage schema.
  • Mappings between the conceptual model and the storage-specific schema can change without changing the application code.
  • Developers can work with a consistent application object model that can be mapped to various storage schemas, possibly implemented in different database management systems.
  • Multiple conceptual models can be mapped to a single storage schema.
  • Language-integrated query (LINQ) support provides compile-time syntax validation for queries against a conceptual model.

LINQ in .NET 3.5

How do they do it? LINQ. It stands for Language INtegrated Query and Microsoft wants it to be THE way that you sift through data in the .NET framework. Its structure will be immediately familiar to anyone with experience writing SQL statements and it marries the simple yet powerful query syntax of SQL to the strong typing of an object-oriented language. The real kicker, however, is that it’s not limited to relational data: anything that implements the IEnumerable or IQueryable interface can be used with LINQ. Here’s a quick example so that you can get an idea of what it’s capable of:

C# Code
// Requires: using System.Collections.Generic; using System.Linq;
List<string> elements = new List<string>()
{
  "Iridium",
  "Einsteinium",
  "Polonium"
};
// Query syntax: keep only the elements that contain the letter "n".
IEnumerable<string> results = from element in elements
                  where element.Contains("n")
                  select element;

It’s just a simple search of a generic string list instance for elements that contain the letter “n”. Whereas before you would have had to accomplish this imperatively, that is, you would have had to write code to iterate over the collection and drive the search, you can now accomplish the same thing declaratively. Basically, you’re stating what you want to do instead of how to do it. While the syntax takes some getting used to, this approach is inherently less error-prone. It also bears mentioning that Intellisense is in full effect in the above sample, so you lose none of the “ease of use” features of Visual Studio when you employ LINQ.
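To see the imperative and declarative styles side by side, here is a small sketch that performs the same search both ways. The ElementSearch class and its method names are made up for this comparison.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ElementSearch
{
    // Imperative: spell out how to walk the list and collect the matches.
    public static List<string> Imperative(List<string> elements)
    {
        var results = new List<string>();
        foreach (string element in elements)
        {
            if (element.Contains("n"))
            {
                results.Add(element);
            }
        }
        return results;
    }

    // Declarative: state what is wanted and let LINQ drive the iteration.
    public static List<string> Declarative(List<string> elements)
    {
        return (from element in elements
                where element.Contains("n")
                select element).ToList();
    }
}
```

For the list of "Iridium", "Einsteinium" and "Polonium", both methods return "Einsteinium" and "Polonium". The imperative version has more places for a mistake to hide (the loop, the condition, the result list), which is the sense in which the declarative form is less error-prone.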


.NET goes open source and cross platform with Mono

10 November 2009 | No Comments »

In the last post I talked about MonoDevelop, an open source, cross-platform IDE for .NET development. This cross-platform .NET development is only possible thanks to the Mono framework.

Mono is a software platform designed to allow developers to easily create cross platform applications. It is an open source implementation of Microsoft’s .Net Framework based on the ECMA standards for C# and the Common Language Runtime. We feel that by embracing a successful, standardized software platform, we can lower the barriers to producing great applications for Linux.

The Components

There are several components that make up Mono:

C# Compiler – The C# compiler is feature complete for compiling C# 1.0 and 2.0 (ECMA), and also contains many of the C# 3.0 features.

Mono Runtime – The runtime implements the ECMA Common Language Infrastructure (CLI). The runtime provides a Just-in-Time (JIT) compiler, an Ahead-of-Time compiler (AOT), a library loader, the garbage collector, a threading system and interoperability functionality.

Base Class Library – The Mono platform provides a comprehensive set of classes that provide a solid foundation to build applications on. These classes are compatible with Microsoft’s .Net Framework classes.

Mono Class Library – Mono also provides many classes that go above and beyond the Base Class Library provided by Microsoft. These provide additional functionality that is useful, especially in building Linux applications. Some examples are classes for Gtk+, Zip files, LDAP, OpenGL, Cairo, POSIX, etc.

The Benefits

There are many benefits to choosing Mono for application development:

Popularity – Built on the success of .Net, there are millions of developers that have experience building applications in C#. There are also tens of thousands of books, websites, tutorials, and example source code to help with any imaginable problem.

Higher-Level Programming – All Mono languages benefit from many features of the runtime, like automatic memory management, reflection, generics, and threading. These features allow you to concentrate on writing your application instead of writing system infrastructure code.

Base Class Library – Having a comprehensive class library provides thousands of built in classes to increase productivity. Need socket code or a hashtable? There’s no need to write your own as it’s built into the platform.

Cross Platform – Mono is built to be cross platform. Mono runs on Linux, Microsoft Windows, Mac OS X, BSD, Sun Solaris, Nintendo Wii, Sony PlayStation 3, and Apple iPhone. It also runs on x86, x86-64, IA64, PowerPC, SPARC (32), ARM, Alpha, s390, s390x (32 and 64 bits) and more. Developing your application with Mono allows you to run on nearly any computer in existence.

Common Language Runtime (CLR) – The CLR allows you to choose the programming language you like best to work with, and it can interoperate with code written in any other CLR language. For example, you can write a class in C#, inherit from it in VB.Net, and use it in Eiffel. You can choose to write code in Mono in a variety of programming languages.

–courtesy mono project


MonoDevelop opens up Mac for .NET development

7 November 2009 | 1 Comment »

As I have said, I am a technology evangelist. I like new technology in .NET as well as in open source. The other thing I like is the Mac. But before MonoDevelop these were two different worlds: you could not develop a .NET application on Mac OS X. MonoDevelop has solved most of my problem; you could say it is the new bridge between the platforms.

MonoDevelop is an open source integrated development environment for Linux and Mac OS X, with Windows to be supported in the future. It allows you to develop software targeting Mono and the .NET Framework. The IDE has features like IntelliSense, source control integration, and an integrated GUI and web designer.

MonoDevelop has recently launched the latest version of the IDE. To read more about it, see http://monodevelop.com/Download/MonoDevelop_2.0_Released.

If you’ve worked with Microsoft Visual Studio, you will see many similarities in MonoDevelop and will feel quite comfortable in the Mono environment. If you’re new to MonoDevelop and haven’t worked in Visual Studio, you’ll find that the learning curve is not very steep.

A new competitor for the Visual Studio IDE… eh… let’s see!!
