Problem
You need to generate random data directly into SQL Server table columns or close to the database engine as variables or expressions. Looking at the SQL Server available functions, you notice that only RAND function offers support for random data generation. Although RAND([seed]) is a built-in function, it can only return a float value between 0 and 1, and has other limitations in regards to seed values. Because your table columns may be of various data types, and each data type may have a lower value and an upper value, you would prefer to create your custom random data generators. This is when SQL Server CLR functions come into play and provide a viable solution.
Solution
.NET Base Class Library provides the System.Random class which is a pseudo-random number generator, producing a sequence of numbers meeting certain requirements for randomness. The results provided by this class' methods are excellent candidates for our purposes, as they are fast obtained, and the duplicate values produced for a large number of usages, may be also good for representing column data within constraints. One of the constructors of the Random class also takes a seed as a parameter, where the seed generation has lots of flexibility. Listed below are some of the constructors (2 in total) and the methods (12 in total) in the Random class, which will be used in the code samples:
- Random(Int32): initializes a new instance of the Random class, using the specified seed value
- Next(Int32, Int32): returns a random number (Int32) within the specified range
- NextDouble: returns a random number (Double) between 0.0 and 1.0
Describing how these pseudo-random numbers are chosen with equal probability from a finite set of numbers, and the theory behind how the seed value influences the algorithm, is beyond the scope of this article. Here we will use a seed value that is time-dependent, and this is sufficient for all practical purposes.
SQL Server data types are organized into 7 categories - like Exact numerics, Date and time - each containing several groups - like float and real groups in the Approximate numerics category. In the tip presented here we will show the code necessary to develop SQL CLR C# functions that can be used to return values in the int group in the Exact numerics category, as well as values in the float group in the Approximate numerics category.
Notice how the seed value is initialized using the current millisecond, and how NULL input parameters are replaced with their minimum - maximum .NET representation for int or double data type.
And now our functions become pretty straightforward, with the code sample below having comments to improve clarity:
using Microsoft.SqlServer.Server;
using System;
using System.Data.SqlTypes;
public class RandomFunctions
{
[SqlFunction]
public static SqlInt32 RandomInt(SqlInt32 minValue,
SqlInt32 maxValue)
{
// Returns a random int between minValue and maxValue
try
{
// If minValue or maxValue is null
// replace it with an appropriate value
if (minValue.IsNull)
// A constant representing the smallest possible value of a SqlInt32
minValue = SqlInt32.MinValue;
if (maxValue.IsNull)
// A constant representing the largest possible value of a SqlInt32
maxValue = SqlInt32.MaxValue;
// Check if the size of input parameters is properly specified
if (minValue > maxValue)
return SqlInt32.Null;
else if (minValue == maxValue)
return minValue;
else
{
// Return the random int within the minValue, maxValue range
return (new Random(DateTime.Now.Millisecond)).Next(minValue.Value, maxValue.Value);
}
}
catch (Exception ex)
{
// On any error, return NULL
return SqlInt32.Null;
}
}
[SqlFunction]
public static SqlDouble RandomFloat(SqlDouble minValue,
SqlDouble maxValue)
{
try
{
// Returns a random float between minValue and maxValue
if (minValue.IsNull)
// A constant representing the smallest possible value of a SqlDouble
minValue = SqlDouble.MinValue;
if (maxValue.IsNull)
// A constant representing the largest possible value of a SqlDouble
maxValue = SqlDouble.MaxValue;
// Check if the size of input parameters is properly specified
if (minValue > maxValue)
return SqlDouble.Null;
else if (minValue == maxValue)
return minValue;
else
{
var rnd = new Random(DateTime.Now.Millisecond);
// Get a random double between 0.0 and 1.0.
var rand = rnd.NextDouble();
// Return a random double the minValue, maxValue range
return (1 - rand) * minValue.Value + rand * maxValue.Value;
}
}
catch (Exception ex)
{
// On any error, return NULL
return SqlDouble.Null;
}
}
};
A sample SQL Server run is shown in the script below:
------------------------------------------------------------
-- RandomInt function
------------------------------------------------------------
--
select dbo.RandomInt(null, null)
--possible returns
-- -300396732
-- 726643699
--
select dbo.RandomInt(11, 9)
--returns NULL because minValue (11) is > maxValue (9)
--
select dbo.RandomInt(8, 8)
--returns 8 because minValue (8) is = maxValue (8)
--
select dbo.RandomInt(-22, 4)
--possible returns
-- -13
-- -3
--
select dbo.RandomInt(3667, 80956)
--possible return
-- 63587
--
select dbo.RandomInt(null, 0)
--possible return
-- -1746587208
--
------------------------------------------------------------
-- RandomFloat function
------------------------------------------------------------
--
select dbo.RandomFloat(null, null)
--possible returns
-- 1.59596487705279E+308
-- -4.64457392666594E+307
--
select dbo.RandomFloat(11.22, 9.99)
--returns NULL because minValue (11.22) is > maxValue (9.99)
--
select dbo.RandomFloat(8.88, 8.880)
--returns 8.88 because minValue (8.88) is = maxValue (8.880)
--
select dbo.RandomFloat(-22.333, 4.5)
--possible returns
-- -4.82918367984667
-- 1.67866188385369
--
select dbo.RandomFloat(3666.7, 80955.6785)
--possible return
-- 56206.2634003703
--
select dbo.RandomFloat(0, null)
--possible return
-- 5.68041671739381E+307
--
------------------------------------------------------------
The advantages of implementing the basic numeric random data generation functions are straightforward as they can be used inline of SQL statements, as well as to fill in T-SQL variables or computed table columns within your SQL Server database environment.
Random data generation using .NET methods provide a very powerful functionality that every Developer/DBA should attempt to implement if requirements arise.
Our tip discussed here just gets you started writing your own SQL CLR functions embedding this attractive alternative in your data generation solutions.
Next Steps
- Compile, deploy, and use the RandomInt and RandomFloat functions; enhance them to suit your needs.
- Explore uses of other data types as input / output for SQL CLR random functions.
- Use complex mathematics in your SQL Server CLR development toolkit.
No comments:
Post a Comment