rlike, like, not like, and regexp in Hive: differences and usage

1. Using like

1. Syntax rules:

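  1. The form is A like B, where A is a string and B is a pattern. It tests whether B matches A in its entirety, in other words whether the pattern B can describe the whole content of A. Note that this differs from rlike. The result is TRUE or FALSE.
  2. B can only use the simple wildcards _ and %: "_" matches any single character, and "%" matches any number of characters (including none).
  3. like matches character by character, starting from the first character of A, so a single differing character makes the match fail.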

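2. Operand types: strings
3. Description: if string A or string B is NULL, the result is NULL; if string A matches the simple SQL pattern B, the result is TRUE; otherwise it is FALSE. Pay particular attention to NULL: the result is neither TRUE nor FALSE but NULL. In fact, apart from is null and is not null (and the null-safe equality operator <=>), relational operators return NULL rather than TRUE/FALSE whenever an operand is NULL.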

hive (default)> select  'abcde'  like 'abc';
OK
false
hive (default)> select  null   like '%';
OK
NULL
hive (default)> select  'abc'   like null ;
OK
NULL

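4. Examples

'foobar' LIKE 'foo' evaluates to FALSE, whereas 'foobar' LIKE 'foo___' and 'foobar' LIKE 'foo%' both evaluate to TRUE. To escape %, use \ (\% matches a single literal % character). If the data contains a semicolon and you want to match it, the semicolon must be escaped, as in 'a\;b'.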


hive (default)> select  'abcde'  like 'abc';
OK
false
hive (default)> select  'abcde'  like 'abc__';
OK
true
hive (default)> select  'abcde'  like 'abc%';
OK
true
hive (default)> select  'abcde'  like '%abc%';
OK
true
hive (default)> select  'abcde'  like 'bc%';
OK
false
hive (default)> select  'abcde'  like '_bc%';
OK
true
hive (default)> select  'abcde'  like '_b%';
OK
true
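
The rules above (whole-string match, "_" for one character, "%" for any run of characters, NULL in gives NULL out) can be sketched as a small Python model. This is only an illustration of the semantics described here, not Hive's implementation: the function name hive_like is hypothetical, None stands in for NULL, and backslash escaping is left out.

```python
from typing import Optional


def hive_like(a: Optional[str], b: Optional[str]) -> Optional[bool]:
    """Illustrative model of `A LIKE B`; None plays the role of NULL."""
    if a is None or b is None:
        return None  # a NULL operand yields NULL, not TRUE/FALSE

    def match(i: int, j: int) -> bool:
        # i is the position in the string a, j the position in the pattern b
        if j == len(b):
            return i == len(a)  # pattern exhausted: the string must be too
        if b[j] == "%":
            # % may absorb zero or more characters of a
            return any(match(k, j + 1) for k in range(i, len(a) + 1))
        if i == len(a):
            return False  # pattern still needs a character, none left
        if b[j] == "_" or b[j] == a[i]:
            return match(i + 1, j + 1)  # _ matches any one character
        return False

    return match(0, 0)


# The same cases as the transcript above
for pattern in ["abc", "abc__", "abc%", "%abc%", "bc%", "_bc%", "_b%"]:
    print(pattern, hive_like("abcde", pattern))
```

Because the pattern has to consume the string from its first character to its last, 'bc%' fails on 'abcde' while '_bc%' succeeds.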

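5. Note: for a negated comparison, use NOT A LIKE B (A NOT LIKE B also works). The result is the opposite of the like result. The exception is NULL: when an operand is NULL, the result stays NULL.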

hive (default)> select  'abcde'  like 'abc';
OK
false
hive (default)> select  not   'abcde'  like 'abc';
OK
true
hive (default)> select    'abcde' not   like 'abc';
OK
true
hive (default)> select null like '%';
OK
NULL
hive (default)> select not  null like '%';
OK
NULL
hive (default)> select   null not  like '%';
OK
NULL
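
The way NULL survives negation in the transcript above can be sketched as three-valued logic in Python. This is a minimal model under the assumption that None represents NULL; sql_not is a hypothetical helper name, not a Hive function.

```python
from typing import Optional


def sql_not(value: Optional[bool]) -> Optional[bool]:
    """Three-valued NOT: TRUE and FALSE flip, NULL (None) stays NULL."""
    return None if value is None else not value


# Results of the LIKE comparisons, taken from the transcript above
like_results = {
    "'abcde' like 'abc'": False,
    "null like '%'": None,
}

for expression, result in like_results.items():
    print(f"not ({expression}) -> {sql_not(result)}")
```

In a WHERE clause a NULL result is treated like FALSE for filtering purposes, so rows where the column is NULL are returned by neither like nor not like.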

2. Using the RLIKE operator

1. Syntax rules:

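  1. A RLIKE B tests whether the pattern B matches anywhere inside A (any substring is enough), whereas A LIKE B tests whether B matches the whole of A.
  2. B may use the full Java regular expression syntax; see the Java documentation for the exact rules.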

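2. Operand types: strings
3. Description: if string A or string B is NULL, the result is NULL; if any substring of A matches the Java regular expression B, the result is TRUE; otherwise it is FALSE.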

hive (default)> select 'footbar' rlike '^f..]+r$';
OK
false
hive (default)> select 'footbar' rlike '^f.*r$';
OK
true
hive (default)> select 'foobar' rlike 'foo';  -- note: the same pattern matches with rlike
OK
true
hive (default)> select 'foobar' like 'foo';  -- note: the same pattern fails with like
OK
false
hive (default)> select '123456' rlike '^\\d+$';
OK
true
hive (default)> select null rlike '.*';
OK
NULL
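
The difference between "B matches somewhere in A" and "B matches all of A" can be sketched in Python with re.search versus re.fullmatch. This is a rough model only: Hive uses Java regular expressions, and Python's re dialect differs in some details, so the sketch sticks to patterns that behave the same in both. The name hive_rlike is hypothetical and None stands in for NULL.

```python
import re
from typing import Optional


def hive_rlike(a: Optional[str], b: Optional[str]) -> Optional[bool]:
    """Illustrative model of `A RLIKE B`: TRUE if B matches any substring of A."""
    if a is None or b is None:
        return None  # a NULL operand yields NULL
    return re.search(b, a) is not None  # search scans for a match anywhere


print(hive_rlike("foobar", "foo"))       # substring match is enough
print(hive_rlike("footbar", "^f.*r$"))   # anchors force a whole-string match
print(hive_rlike("123456", r"^\d+$"))    # digits only
print(hive_rlike(None, ".*"))            # NULL in, NULL out

# For contrast: requiring the whole string to match, as like does
print(re.fullmatch("foo", "foobar") is not None)
```

Note that in the Hive query the backslash is doubled ('^\\d+$') because the string literal is unescaped before the regular expression is compiled; the Python raw string plays the same role here.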

3. NOT A LIKE B and A not like B

  1. not..like is the negation of like: if like returns true, not..like returns false, and vice versa. The exception is NULL: when an operand is NULL, the result is always NULL.

hive> select 1 from t_fin_demo  where NOT 'football' like 'fff%';
        1
hive> select 1 from t_fin_demo where 'football' not  like 'fff%';
        1
hive> select 1 from t_fin_demo where 'football'  like 'fff%';

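4. Summary: comparing like, rlike, NOT A LIKE B, and A NOT LIKE B

   1. rlike and like do roughly the same job. like supports only the simple wildcards _ and %, whereas rlike supports full regular expression syntax, so if you are comfortable with regular expressions, rlike is the more powerful choice. Every like pattern can be rewritten as an rlike pattern (anchored with ^ and $), but not the other way round. Note, however, that like matches character by character from the start and must match the whole string, whereas rlike can match starting at any position and does not have to match the whole string.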

hive (default)> select 'foobar' like 'foo';
OK
false
hive (default)> select 'foobar' like 'oo%';
OK
false
hive (default)> select 'foobar' rlike 'foo';
OK
true
hive (default)> select 'foobar' rlike '.oo.*';
OK
true
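
The claim that any like pattern can be replaced by an rlike pattern can be sketched as a small translation: "%" becomes ".*", "_" becomes ".", every other character is escaped, and the result is anchored with ^ and $ so that the partial-match behaviour of rlike is turned back into a whole-string match. The helper name like_to_rlike is hypothetical, backslash escapes in the like pattern are not handled, and strings containing newlines may need the (?s) flag, which this sketch does not add.

```python
import re


def like_to_rlike(pattern: str) -> str:
    """Translate a simple LIKE pattern into an anchored regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")           # any run of characters
        elif ch == "_":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # literal text, regex metacharacters escaped
    return "^" + "".join(parts) + "$"


for like_pattern in ["foo", "foo%", "oo%", "_oo%"]:
    regex = like_to_rlike(like_pattern)
    matched = re.search(regex, "foobar") is not None
    print(f"{like_pattern!r} -> {regex!r} -> {matched}")
```

With this translation, 'foobar' rlike '^foo.*$' would be expected to agree with 'foobar' like 'foo%', and the unanchored 'foo' is what makes rlike succeed where like fails.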

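   2. NOT A LIKE B negates the result of LIKE: if like returns true, not..like returns false, and vice versa. In practice you can also write A NOT LIKE B, which is likewise the negation of LIKE and behaves the same as NOT A LIKE B. The exception is NULL: when an operand is NULL, the result is always NULL.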

   3. NOT RLIKE works the same way: NOT A RLIKE B is the negation of RLIKE. Again, when an operand is NULL, the result is always NULL.

5. regexp is used in exactly the same way as rlike
