scala> for (i <- 0 until 2) println("Hello")   // i < 2
Hello
Hello

scala> for (i <- 0 to 2) println("Hello")      // i <= 2
Hello
Hello
Hello
def add(more: Int) = (x: Int) => x + more
val int1 = add(1)   // the returned closure captures more = 1
int1(10)            // invokes x => x + 1, giving 11
def echo(args: String*) = for (arg <- args) println(arg)

val arr = Array("what's", "up", "doc?")
echo(arr: _*)
// what's
// up
// doc?
Higher-order functions: functions that take other functions as parameters.

def filesMatching(query: String, matcher: (String, String) => Boolean) =
  for (file <- filesHere; if matcher(file.getName, query))
    yield file

def filesEnding(query: String)     = filesMatching(query, _.endsWith(_))
def filesContaining(query: String) = filesMatching(query, _.contains(_))
def filesRegex(query: String)      = filesMatching(query, _.matches(_))
Currying
Currying says: "if you fix some of a function's arguments, you get a function of the remaining arguments." So for the two-variable function y^x, fixing y = 2 yields the one-variable function 2^x.

def first(x: Int) = (y: Int) => x + y   // first: (x: Int)Int => Int

def twice(op: Int => Int, x: Int) = op(op(x))   // a higher-order function
twice(_ + 1, 5)   // here op(x) = x + 1, so op(op(5)) = 7
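Scala also has dedicated syntax for curried definitions; a minimal sketch (the names are illustrative):

def add(x: Int)(y: Int) = x + y   // curried: two parameter lists
val addTwo = add(2) _             // fix x = 2; addTwo is a function Int => Int
addTwo(3)                         // 5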
Loan Pattern
Roughly, the pattern says: use it for resource-intensive objects. Since the resources are concentrated in one object, client code should not hold on to them permanently; instead it should borrow the resource from its provider when needed and return it immediately after use.
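A minimal Scala sketch of the pattern (the file-writing scenario is illustrative):

import java.io.{File, PrintWriter}

// The caller borrows the PrintWriter; withPrintWriter guarantees the
// resource is "returned" (closed) no matter what op does.
def withPrintWriter(file: File)(op: PrintWriter => Unit): Unit = {
  val writer = new PrintWriter(file)
  try op(writer)
  finally writer.close()
}

withPrintWriter(new File("date.txt")) { writer =>
  writer.println(new java.util.Date)
}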
Fragile base class: a derived class is tightly coupled to its base class, and this tight coupling is unwelcome. Designers have nicknamed this behavior the "fragile base class problem". The base class is considered fragile because you can modify it in a seemingly safe way, yet the new behavior, when inherited by derived classes, may cause those derived classes to malfunction. You cannot judge whether a change to the base class is safe simply by examining the base class's methods in isolation; you must also examine (and test) all the derived classes. Furthermore, you must check all code that uses both base-class and derived-class objects, because that code may also be broken by the new behavior. A small change to a key base class can render the entire program inoperable.
An object can be viewed as a special kind of class; think of it as a singleton of that class.
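A minimal sketch:

object Logger {                    // exactly one instance ever exists
  def log(msg: String): Unit = println("LOG: " + msg)
}

Logger.log("hello")                // used through the object's name; no `new`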
Scala class hierarchy (diagram): Any sits at the top of the hierarchy, with all classes beneath it; Nothing (and Null) sit at the bottom.
A trait is something like an abstract interface.
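A minimal sketch (unlike a plain Java interface, a trait may carry concrete methods):

trait Greeter {
  def name: String                              // abstract member, as in an interface
  def greet(): Unit = println("Hi, " + name)    // concrete method supplied by the trait
}

class Person(val name: String) extends Greeter

new Person("Bob").greet()   // prints: Hi, Bob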
UI + Database two-layer architecture: this database-oriented architecture (the table-module style in the figure above) has no flexibility.
UI + Service + Database multi-layer SOA architecture: this service + table-model architecture easily makes services bloated, hard to maintain and extend, and poor at scaling; see the discussion here, or "The Biggest Flaw of Spring Web Applications | Java". Is it really a flaw? I do not think so.
DDD + SOA event-driven CQRS architecture with separated reads and writes: it copes with complex business logic, replaces the data-table model with aggregate models, and replaces serial message-driven processing with concurrent event-driven processing, truly achieving flexible extension centered on business entities.
Bloodless model (失血模型): in short, the domain objects are pure data classes with nothing but getter/setter methods, and all business logic is done by business objects (also known as Transaction Script). Martin Fowler calls a domain object in this model an "anemic domain object".
Anemic model (贫血模型): in short, the domain object contains the domain logic that does not depend on persistence, while the persistence-dependent domain logic is separated out into the Service layer. Service (business logic, transaction wrapping) --> DAO --> domain object
Advantages of this model:
1. The layers depend on one another in a single direction; the structure is clear and easy to implement and maintain.
2. The design is simple and workable, and the underlying model is very stable.
Disadvantages of this model:
1. The persistence-dependent part of the domain object's logic is separated out into the Service layer, which is not very OO.
2. The Service layer is too heavy.
Rich model (充血模型): much like the previous model; the difference is where the business logic is divided. Here, most business logic (including persistence logic) is placed in the domain object, while the Service layer is a very thin layer that only wraps transactions and a little logic and does not touch the DAO layer. Service (transaction wrapping) ---> domain object <---> DAO
Advantages of this model:
1. It conforms better to OO principles.
2. The Service layer is thin, acting only as a Facade, and does not deal with the DAO.
Disadvantages of this model:
1. The DAO and the domain objects form a bidirectional dependency; complex bidirectional dependencies cause many latent problems.
2. How to split logic between the Service layer and the domain layer is quite fuzzy; in real projects, differences in designers' and developers' skill can leave the whole structure in disorder.
3. Because the Service layer wraps transactions, it must provide a transaction-wrapping method for every piece of domain-object logic, so Service ends up redefining all the domain logic, which is very tedious; and the transactional wrapping effectively turns the OO domain logic back into procedural Service Transaction Scripts. The OO so painstakingly achieved in the domain layer becomes procedural again in the Service layer; from a web-layer programmer's point of view, it is no different from the anemic model.
Notes:
1. I do not want transactions managed by Item itself, but by the container or by a higher-level business class.
2. If Item never leaves the persistence layer's management (e.g., JDO's PersistenceManager), then itemDao.update(this) is unnecessary; that is, Item is fetched from the database within the transaction and its lifecycle does not exceed the transaction's scope.
3. If Item is detached from the persistence layer, i.e., its lifecycle exceeds the transaction's scope, then update or attach-style persistence methods must be called explicitly; in that case you should follow robbin's second model.
Swollen model (胀血模型): prompted by the third disadvantage of the rich model, some suggest simply dropping the Service layer, leaving only domain object and DAO, and wrapping transactions around the domain object's domain logic. domain object (transaction wrapping, business logic) <---> DAO. Ruby on Rails seems to use this model; it even merges the domain object and the DAO. Advantages: 1. simplified layering; 2. still reasonably OO. Disadvantages: 1. much service logic that is not domain logic gets forced into the domain object, destabilizing the domain object model; 2. the domain object exposes too much information to the web layer, which may cause unexpected side effects.
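A minimal Scala sketch of the contrast between the anemic and rich styles (the types and names are illustrative, not from the original discussion):

// Anemic style: the entity is pure data; the service owns the rule.
case class Account(id: Long, balance: BigDecimal)

trait AccountDao { def update(acc: Account): Unit }

class AccountService(dao: AccountDao) {
  def withdraw(acc: Account, amount: BigDecimal): Account = {
    require(acc.balance >= amount, "insufficient funds")   // business rule lives here
    val updated = acc.copy(balance = acc.balance - amount)
    dao.update(updated)                                    // persistence handled by the service
    updated
  }
}

// Rich style: the entity owns its business rule; a thin service would
// only wrap the call in a transaction.
class RichAccount(val id: Long, var balance: BigDecimal) {
  def withdraw(amount: BigDecimal): Unit = {
    require(balance >= amount, "insufficient funds")
    balance -= amount
  }
}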
My take: the bloodless model came from Hibernate's popularity, with data logic implemented in the Dao/impl layer. The anemic model is exactly Spring's classic MVC, which still serves web development well today; "not OO enough" is a non-issue. The rich and swollen models bring nothing I can pin down, and I see no obvious benefit; on the contrary, they look like they would scramble the logic, which is bad news for an imperative programming mindset. So the anemic model remains my first choice, at least for now.
DDD belongs to the rich model.
DDD's greatest benefit: the very first step on receiving a requirement is to think about the domain model, rather than cutting it into data and behavior, implementing the data with a database and the behavior with services, and ultimately severing the requirement's head from its body. DDD makes you think first in the business language rather than in data; the different emphasis leads to a different programming worldview.
DDD's place in the software production process is shown in the figure below. Putting DDD into practice is inseparable from several related areas: in-memory caching, CQRS, DCI, EDA, and Event Sourcing.
Outer rings may depend on inner rings, but not the reverse; the entities at the core represent the business and must not depend on the technical environment they happen to run in.
DCI (http://www.jdon.com/37976): DCI stands for Data, Context, Interactions. DCI is a pattern that pays special attention to behavior (it can be matched against the GoF behavioral patterns), whereas MVC is a structural pattern; being structural, MVC may overlook behavioral events. The DCI architecture separates "what it is" from "what it does" and then combines the two dynamically in different scenes as requirements dictate, with a flavor of the Bridge pattern.
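Scala traits give a compact illustration of DCI: the data class stays dumb, and a role (behavior) is mixed in only within a given context. A minimal sketch (the names are illustrative):

// Data: what it is
class BankAccount(var balance: BigDecimal)

// Role: what it does, kept separate from the data
trait TransferSource { this: BankAccount =>
  def withdraw(amount: BigDecimal): Unit = { balance -= amount }
}

// Context: binds the data to the role for one scenario
val src = new BankAccount(100) with TransferSource
src.withdraw(30)
println(src.balance)   // 70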
CQRS: Command Query Responsibility Segregation is an architectural pattern that separates the commands that change the model's state from the queries that read that state. It is a pattern within DDD's application area, mainly addressing how DDD handles database report output.
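A minimal sketch of the separation (an in-process model with illustrative names; real CQRS systems often back the query side with a separate, denormalized store):

// Commands change state and return nothing of substance.
trait OrderCommands {
  def placeOrder(productId: Long, qty: Int): Unit
  def cancelOrder(orderId: Long): Unit
}

// Queries return data and never change state.
trait OrderQueries {
  def orderStatus(orderId: Long): String
  def dailyReport(day: String): Seq[String]
}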
Allow an application to be driven equally by users, programs, automated tests, or batch scripts, and to be developed and tested in isolation from its run-time devices and databases. An event arrives from the outside world at a port; a technology-specific adapter converts it into a usable procedure call or message and passes it on to the application. The application can remain ignorant of the nature of the input device (who the caller is). When the application has a result to send out, it sends it through a port to an adapter, which creates the appropriate signals required by the receiving technology (human or automated). The application interacts semantically and benignly with the adapters on all its sides without actually knowing who is handling things at the adapters' other ends.
Takeaway: reading this gives me a headache!!!
SVG allows three types of graphic objects:
- vector graphic shapes (e.g., paths made of straight lines and curves)
- images
- text
Graphic objects (including text) can be grouped, styled, transformed, and composited into previously rendered objects. The SVG feature set includes nested transformations, clipping paths, alpha masks, and template objects. SVG drawings are interactive and dynamic; for example, scripts can be used to define and trigger animations. Compared with Flash this is a strong point: Flash files are binary, so creating and modifying them dynamically is difficult, whereas SVG is a text file, so dynamic manipulation is quite easy. Moreover, SVG directly provides elements for animation, which makes it very convenient to work with.

SVG vs Canvas: SVG is compatible with the other web standards and directly supports the Document Object Model (DOM), which is also a strong point compared with HTML5's canvas. Note that internally SVG also uses something canvas-like to present SVG graphics, and later you will find many features that resemble HTML5's canvas; in this text, unless a canvas is explicitly described as SVG's, "canvas" refers to the HTML5 canvas element. Thanks to DOM support, many advanced SVG applications can conveniently be scripted, and SVG's graphic elements support essentially all the standard DOM events: a wealth of event handlers (such as "onmouseover" and "onclick") can be attached to any SVG graphic object. Although SVG's rendering speed cannot match the canvas element's, it wins on flexible DOM manipulation, an advantage that fully makes up for the disadvantage in speed.
Features
• SVG files are pure XML and can be read and modified by a great many tools (even Notepad).
• SVG images are smaller and more compressible than JPEG and GIF images.
• SVG is scalable: images can be enlarged without loss of quality and printed at high quality at any resolution.
• Text in an SVG image is selectable and searchable (well suited to making maps).
• SVG can run together with Java technology.
• SVG is an open standard.
Browser support: IE9+; SVG can run directly inside HTML.
Rendering order: front to back in document order, so later elements paint over earlier ones.
SVG is defined in XML, so it is case-sensitive; this differs from HTML.
http://www.cnblogs.com/dxy1982/archive/2012/04/06/2395729.html
<svg width="200" height="250">
  <rect x="10" y="10" width="30" height="30" stroke="black" fill="transparent" stroke-width="5"/>
  <rect x="60" y="10" rx="10" ry="10" width="30" height="30" stroke="black" fill="transparent" stroke-width="5"/>
  <circle cx="25" cy="75" r="20" stroke="red" fill="transparent" stroke-width="5"/>
  <ellipse cx="75" cy="75" rx="20" ry="5" stroke="red" fill="transparent" stroke-width="5"/>
  <line x1="10" x2="50" y1="110" y2="150" stroke="orange" fill="transparent" stroke-width="5"/>
  <polyline points="60 110 65 120 70 115 75 130 80 125 85 140 90 135 95 150 100 145"
            stroke="orange" fill="transparent" stroke-width="5"/>
  <polygon points="50 160 55 180 70 180 60 190 65 205 50 195 35 205 40 190 30 180 45 180"
           stroke="green" fill="transparent" stroke-width="5"/>
  <path d="M20,230 Q40,205 50,230 T90,230" fill="none" stroke="blue" stroke-width="5"/>
</svg>
Rectangle - the rect element
x: x coordinate of the rectangle's top-left corner (in the user coordinate system).
y: y coordinate of the rectangle's top-left corner.
width: the rectangle's width.
height: the rectangle's height.
rx: for rounded corners, the corner radius along the x axis.
ry: for rounded corners, the corner radius along the y axis.
Circle - the circle element
r: the circle's radius.
cx: x coordinate of the center.
cy: y coordinate of the center.
Ellipse - the ellipse element
rx: the semi-major axis (x radius).
ry: the semi-minor axis (y radius).
cx: x coordinate of the center.
cy: y coordinate of the center.
Line - the line element
x1: x coordinate of the start point.
y1: y coordinate of the start point.
x2: x coordinate of the end point.
y2: y coordinate of the end point.
Polyline - the polyline element
points: a series of points separated by spaces, commas, or newlines. Each point must consist of two numbers, an x value and a y value, so the three points (0,0), (1,1), and (2,2) can be written as "0 0, 1 1, 2 2".
Polygon - the polygon element
points: a series of points separated by spaces, commas, or newlines. Each point must consist of two numbers, an x value and a y value, so the three points (0,0), (1,1), and (2,2) can be written as "0 0, 1 1, 2 2".
Path - the path element
This is the most general and most powerful element: with it you can draw any other figure, not only the basic shapes above but also complex shapes such as Bézier curves. path can also draw smooth transition segments; that effect can be approximated with polyline, but it requires many points and still degrades when scaled up. Only one attribute controls position and shape:
d: a sequence of drawing commands combined with their parameters (points).
Drawing commands come in absolute-coordinate and relative-coordinate variants. Both use the same letters, differing only in case: absolute commands use uppercase letters and absolute coordinates; relative commands use the corresponding lowercase letters, with each point expressed as an offset.
Absolute-coordinate drawing commands
The parameters of these commands are absolute coordinates. Assuming the pen currently sits at (x0, y0), the absolute commands mean the following:
Command  Parameters  Meaning
M  x y  move the pen to the point (x, y)
L  x y  draw a line segment from the current point to (x, y)
H  x  draw a horizontal segment from the current point to (x, y0)
V  y  draw a vertical segment from the current point to (x0, y)
A  rx ry x-axis-rotation large-arc-flag sweep-flag x y  draw an elliptical arc from the current point to (x, y)
C  x1 y1, x2 y2, x y  draw a cubic Bézier curve from the current point to (x, y)
S  x2 y2, x y  shorthand cubic Bézier (the first control point is omitted)
Q  x1 y1, x y  draw a quadratic Bézier curve to (x, y)
T  x y  shorthand quadratic Bézier (the control point is omitted)
Z  (none)  close the figure; if d has no Z command, the result is an open line rather than a closed shape
The pen-move command M, the line commands L, H, and V, and the close command Z are all straightforward.
The arc command: A rx ry x-axis-rotation large-arc-flag sweep-flag x y
rx, ry: the lengths of the arc's semi-major and semi-minor axes.
x-axis-rotation: the angle between the arc's x axis and the horizontal, i.e., a counterclockwise rotation of the x axis; a negative value means a clockwise rotation.
large-arc-flag: 1 selects the large arc, 0 the small arc.
sweep-flag: 1 means the arc sweeps clockwise about the center from start to end, 0 means counterclockwise.
x, y: the coordinates of the arc's endpoint.
<svg width="320px" height="320px">
  <path d="M 10 315
           L 110 215
           A 30 50 0 0 1 162.55 162.45
           L 172.55 152.45
           A 30 50 -45 0 1 215.1 109.9
           L 315 10"
        stroke="black" fill="green" stroke-width="2" fill-opacity="0.5"/>
</svg>
The cubic Bézier command: C x1 y1, x2 y2, x y
A cubic Bézier curve has two control points, (x1, y1) and (x2, y2); the final (x, y) is the curve's endpoint. Study the example below:
<svg width="190px" height="160px">
  <path d="M10 10 C 20 20, 40 20, 50 10" stroke="black" fill="transparent"/>
  <path d="M70 10 C 70 20, 120 20, 120 10" stroke="black" fill="transparent"/>
  <path d="M130 10 C 120 20, 180 20, 170 10" stroke="black" fill="transparent"/>
  <path d="M10 60 C 20 80, 40 80, 50 60" stroke="black" fill="transparent"/>
  <path d="M70 60 C 70 80, 110 80, 110 60" stroke="black" fill="transparent"/>
  <path d="M130 60 C 120 80, 180 80, 170 60" stroke="black" fill="transparent"/>
  <path d="M10 110 C 20 140, 40 140, 50 110" stroke="black" fill="transparent"/>
  <path d="M70 110 C 70 140, 110 140, 110 110" stroke="black" fill="transparent"/>
  <path d="M130 110 C 120 140, 180 140, 170 110" stroke="black" fill="transparent"/>
</svg>
The shorthand cubic Bézier: S x2 y2, x y
Often, to draw a smooth curve, several curve segments are drawn in succession. For a smooth transition, the second curve's first control point is usually the mirror image of the first curve's second control point, reflected to the other side of the curve; this shorthand covers that case. Note: if an S command is not preceded by another S or C command, its two control points are taken to be the same and the segment degenerates into something like a quadratic Bézier; if the S command follows an S or C command, its first control point defaults to the reflection of the preceding curve's second control point. Try it:
<svg width="190px" height="160px">
  <path d="M10 80 C 40 10, 65 10, 95 80 S 150 150, 180 80" stroke="black" fill="transparent"/>
</svg>
The quadratic Bézier commands: Q x1 y1, x y and T x y (the shorthand quadratic Bézier)
When drawing consecutive curves, the shorthand T can likewise be used. As before, only when T is preceded by a Q or T command does the T command's control point default to the reflection of the preceding curve's control point. Try it:
<svg width="190px" height="160px">
  <path d="M10 80 Q 52.5 10, 95 80 T 180 80" stroke="black" fill="transparent"/>
</svg>
Notes on drawing SVG paths
When drawing shapes with holes, note: the outer edge must be drawn in counterclockwise order, and the edges of the inner holes must be drawn clockwise. Only then will the shape's fill render correctly.
<svg>
  <rect width="300" height="200" fill="red" />
  <circle cx="150" cy="100" r="80" fill="green" />
  <text x="150" y="125" font-size="60" text-anchor="middle" fill="white">SVG</text>
</svg>
As the example above shows, the text element supports the following attributes:
x, y: the coordinates of the text position.
text-anchor: where (x, y) sits relative to the text, i.e., the direction the text runs from that point. It takes one of three values: start, middle, or end. start means (x, y) is at the beginning of the text, which then runs to the right; middle means (x, y) is at the middle of the text, which extends in both directions, i.e., is centered; end means (x, y) is at the end of the text, which runs to the left.
Besides these, the following properties can be specified either in CSS or directly as attributes:
fill, stroke: fill and outline colors; their use is summarized later.
Font-related properties: font-family, font-style, font-weight, font-variant, font-stretch, font-size, font-size-adjust, kerning, letter-spacing, word-spacing and text-decoration.
Rendering images in SVG - the image element
SVG's image element directly supports displaying raster images, and its use is simple; a minimal illustrative snippet (the file name is a placeholder):

<image x="0" y="0" width="100" height="100" xlink:href="photo.jpg"/>
A few points to note here:
1. If the x or y coordinate is not set, it defaults to 0.
2. If width or height is not set, it also defaults to 0.
3. Explicitly setting width or height to 0 disables rendering of the image.
4. Supported image formats include png, jpeg, jpg, svg, and so on; so SVG supports nesting SVG.
5. image is a regular SVG element like any other, so it supports all the clipping, masking, filter, rotation, and similar effects.
Fill color - the fill attribute
<rect x="10" y="10" width="100" height="100" stroke="blue" fill="red"
      fill-opacity="0.5" stroke-opacity="0.8"/>
The example above draws a red rectangle with a blue border. Note that fill-opacity and stroke-opacity set the opacity of the fill and of the outline independently.
Outline color - the stroke attribute
A very interesting little plugin: the Bootstrap form builder http://www.bootcss.com/p/bootstrap-form-builder/
It defines the table structure of the map's attribute data, including the number of fields, the field names, field types and widths, the index fields, and some key spatial information describing the corresponding layer. The TAB file is actually a text file.
It stores the complete map attribute data, including the file header, the table structure description, and each attribute data record.
It records, for every spatial object on the map, a pointer to that object's position in the spatial data file. The order of the pointer column matches the order in which the attribute records are stored in the attribute data file; it is effectively a lookup table for locating spatial objects.
It contains each map object's spatial attributes, such as the object's geometry type, coordinate information, and color information. It also records the record number of the spatial object's attribute data within the attribute data file, so that when a user queries a map object on the map, its related attribute information can be found at once.
It is not required; the system generates it automatically only when the user designates index fields for the database.
1. Spatial model: GIS abstracts the real world into a combination of interconnected layers (LAYER) carrying different features.
2. Geographic reference system: spatial data includes absolute position information (such as longitude/latitude coordinates) as well as relative position information (addresses, codes, survey statistics, and the like).
3. Vector and raster data structures: GIS data comes in two basic models, vector and raster. Vector data is encoded as points, lines, and polygons and stored and managed as strings of (x, y) coordinates; it is the best way to represent discrete spatial features. Raster data (scanned images or photographs) expresses continuous geographic features through a grid of cells.
A raster image is composed of rows of tiny dots (pixels), so it can also be called a bitmap; it is the basis of the subsequent work, layer decomposition, and is also known as the base map.
1. Scanner / digital camera.
2. Save or convert the image into a raster file format, e.g., tif, using a graphics package.
3. Purchase already-registered raster images from MapInfo.
1. gif
2. jpg
3. tif
4. pcx
5. bmp
6. tga
7. bil (SPOT satellite imagery)
Map objects: points, lines, regions.
1. Web Services: fetches data from a MapInfo server; the web services here are MapInfo's own.
2. DBMS: connects to databases via ODBC.
3. Drawing: tools for drawing on the image.
4. Main: the various Select tools: image, workspace, layer, link.
Data file choices include:
• Microsoft Access
• Microsoft Excel
• dBASE DBF
• ESRI® shapefiles
• Raster Images
• Grid Images
• ASCII Delimited Text --> (binary file? which format?)
• Lotus 1-2-3
• Remote Databases (Oracle, SQL Server, PostGIS)
• Workspace
• Comma Delimited CSV files
So we can choose Excel/CSV as Data Source
During the .TAB file creation process, the original file is in no way altered. The file retains its original properties.
Raster image support requires a license.
When you bring in a raster image to MapInfo Professional, you may need to register it (specify its map coordinates) so MapInfo Professional can display it properly. Choosing the Raster Image file format from the Open dialog box will bring you to the Image Registration dialog box where you can specify the appropriate map coordinates. Once you register the image, a process that creates a .TAB file for the image, you can open it as you would open any table in a Map window. Images that you purchase from MapInfo Professional will already be registered.
For a full discussion of raster image display, see Registering SPOT Images in the Help System.
MapInfo Professional imposes a two-gigabyte (2 GB) limit on these files.
A .WOR file is a MapInfo Professional workspace file containing un-compiled MapBasic code that MapInfo Professional interprets to open a session with tables, windows, and settings the way the user left them. It is specific to the MapInfo Professional version, depending on the features used in it. A .MWS workspace file originates from the MapXtreme product line and is comprised of XML code that does things similar to a MapInfo workspace (*.wor), with some limitations.
Without a license, the layer cannot be edited.
• Windows Bitmap (*.BMP)
• Windows Metafile (*.WMF)
• Windows Enhanced Metafile (*.EMF)
• EMF + Metafile (*.EMF)
• EMF + Dual Metafile (*.EMF)
• JPEG File Interchange Format (*.JPG)
• JPEG 2000 (*.JP2)
• Portable Network Graphics Format (*.PNG)
• Tagged Image File Format (*.TIF)
• TIFF CMYK (*.TIF)
• TIFF CCITT Group 4 (*.TIF)
• TIFF LZW (*.TIF)
• Graphic Interchange Format (*.GIF)
• Photoshop 3.0 (*.PSD)
Importing and Exporting Data in AutoCAD Format
Web Services – Sets refresh, timeout values, server options and other default settings for Proxy Servers, WMS, WFS, Geocode server, Drivetime server, and Map Tile server web services.
Solr is a full-text search component built on the Lucene toolkit that runs inside a Servlet container. It can provide simple search services for your own web application, or be assembled into a complex cluster environment for full-text search. For example, once the index files grow very large, roughly 7-90 GB, the setup needs to go distributed, because at that volume a single machine searches the data too slowly. For a cluster demo you can simply start several web application servers on the same machine.
Solr4 architecture diagram
server.xml
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443" URIEncoding="UTF-8"/>
apache-solr-core-3.2.0.jar, apache-solr-solrj-3.2.0.jar --> the Java client used to test Solr full-text search
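A minimal SolrJ query sketch, written here in Scala on the JVM (the core URL and the queried field are assumptions):

import org.apache.solr.client.solrj.SolrQuery
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer

// Point the client at a running Solr instance (the URL is an assumption).
val server = new CommonsHttpSolrServer("http://localhost:8080/solr")

val query = new SolrQuery("name:solr")   // query the "name" field (illustrative)
val rsp = server.query(query)
println("hits: " + rsp.getResults.getNumFound)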
solr.home
Its main directory structure can live under a directory of any name, but inside that directory the layout must follow Solr's requirements: the directory contains the bin, data, and conf directories plus one configuration file, solr.xml, through which the CoreContainer's multiple instances are configured.
bin
Holds the third-party jars for Solr plugins; any plugin packages you need to add to Solr go in here.
data
Contains two directories, index and spellcheck, which hold the index files and the spell-check data respectively.
conf
Holds all the configuration files related to the current SolrCore instance; solrconfig.xml, schema.xml, and scripts.conf are the three most important files.
With the main solr.home directory structure configured as above, we only need SolrDispatchFilter to find this directory when it is instantiated, and our solr.home will be initialized.
One way is to set a JVM system property in a filter's or listener's constructor:
System.setProperty("solr.solr.home", "dir");
It can also be described in web.xml, or, as copied above, specified through Tomcat's JNDI container, i.e., in context.xml.
Configure SolrDispatchFilter.java in web.xml:
<!-- Filter all Solr-related HTTP requests -->
<filter>
  <filter-name>SolrRequestFilter</filter-name>
  <filter-class>org.apache.solr.servlet.SolrDispatchFilter</filter-class>
  <init-param>
    <!-- Requests needing the Solr service are marked by a specific path prefix -->
    <param-name>path-prefix</param-name>
    <param-value>solrservice</param-value>
  </init-param>
</filter>
<filter-mapping>
  <filter-name>SolrRequestFilter</filter-name>
  <url-pattern>/*</url-pattern>
</filter-mapping>
These are the packages contained in the jar:
org.apache.solr.analysis
This package mainly solves the analysis (tokenization) problem. Before studying Solr you must understand Lucene's basic structure; how tokenization works was already covered with Lucene.
org.apache.solr.client.solrj.embedded
SolrJ mainly provides a convenient way to use Solr's full-text search features.
org.apache.solr.core
Introduction to the main classes it contains: org.apache.solr.core.CoreContainer.java, org.apache.solr.core.Config.java, org.apache.solr.core.SolrConfig.java, org.apache.solr.core.SolrCore.java, org.apache.solr.core.CoreDescriptor.java, org.apache.solr.core.DirectoryFactory.java, org.apache.solr.core.SolrResourceLoader.java
No matter how many SolrCore instances you have, they are all kept in the CoreContainer's
protected final Map<String, SolrCore> cores = new LinkedHashMap<String, SolrCore>();
property, which stores the SolrCore instances. So how is a SolrCore created? It is created from a SolrConfig.
SolrConfig extends the Config class; Config parses XML files via XPath.
SolrConfig corresponds to our solrconfig.xml file; SolrResourceLoader loads the configuration files and is used to locate solr.home.
How SolrResourceLoader locates solr.home:
public static String locateSolrHome() {
    String home = null;
    // Locate the solr home directory through JNDI.
    // It can be configured in web.xml like this:
    // <env-entry>
    //   <env-entry-name>solr/home</env-entry-name>
    //   <env-entry-value>C:\\apache-tomcat-6.0.35-windows-x64\\apache-tomcat-6.0.35\\webapps\\solr</env-entry-value>
    //   <env-entry-type>java.lang.String</env-entry-type>
    // </env-entry>
    try {
      Context c = new InitialContext();
      home = (String)c.lookup("java:comp/env/"+project+"/home");
      log.info("Using JNDI solr.home: "+home );
    } catch (NoInitialContextException e) {
      log.info("JNDI not configured for "+project+" (NoInitialContextEx)");
    } catch (NamingException e) {
      log.info("No /"+project+"/home in JNDI");
    } catch( RuntimeException ex ) {
      log.warn("Odd RuntimeException while testing for JNDI: " + ex.getMessage());
    }

    // If JNDI did not find the home directory, try System.getProperty("solr.solr.home"),
    // i.e. a JVM system property, visible anywhere inside the same JVM.
    // You can also set solr home in some listener or filter:
    //   System.setProperty("solr.solr.home", "dir");
    // Note that such a filter must run before SolrDispatchFilter.java, because the
    // container instantiates filters in configuration-file order, and SolrDispatchFilter
    // performs much of Solr's initialization (including the container) when it is
    // instantiated, so the solr home directory must be set before that filter exists.
    if( home == null ) {
      String prop = project + ".solr.home";
      home = System.getProperty(prop);
      if( home != null ) {
        log.info("using system property "+prop+": " + home );
      }
    }

    // If neither of the above found it, search under the web application's own
    // directory: for a webapp named "test", look for the solr directory under test.
    if( home == null ) {
      home = project + '/';
      log.info(project + " home defaulted to '" + home + "' (could not find system property or JNDI)");
    }
    return normalizeDir( home );
}
Once the main solr.home directory is located, the configuration files can be loaded according to the expected directory hierarchy, the container instantiated, and SolrCore instances added to it.
CoreDescriptor mainly describes the characteristics of a SolrCore instance, effectively an introduction to it: how many handlers the core has, where its schema.xml lives, what the core is named, where its index data is stored, and so on. DirectoryFactory corresponds to Lucene's FSDirectory; it locates the directory holding all the index files.
org.apache.solr.handler
This package provides the implementations of most of Solr's services. The handlers process the various kinds of Solr full-text search requests, including CRUD, as well as ways to control our container object and the SolrCore instances.
org.apache.solr.handler.admin.AdminHandlers.java
AdminHandlers is like the head of the solr.home household: it manages the family members, i.e., the SolrCore instances, among other duties.
org.apache.solr.handler.ReplicationHandler.java
ReplicationHandler is needed when running Solr distributed; by default it is configured in solrconfig.xml. The index files on the master and the slaves are the same; the difference between their version numbers drives the synchronization between the two index stores, which is how the distributed index is maintained. The aim is better and faster search: the master handles index maintenance while the slaves serve searches. Several such groups can be combined, with one machine exposed for search; when a query arrives, that machine sends the query to all the slaves beneath it and then unifies the results. Solr already provides this very well; what remains is to design the larger distributed search architecture. The master's configuration:
<requestHandler name="/replication" class="solr.ReplicationHandler" >
  <lst name="master">
    <str name="replicateAfter">commit</str>            <!-- replicate after each commit -->
    <str name="httpConnTimeout">50000000</str>         <!-- connection timeout for replication -->
    <str name="httpReadTimeout">10000000</str>         <!-- timeout for reading index files -->
    <str name="confFiles">schema.xml,stopwords.txt,elevate.xml</str>  <!-- config files distributed when a slave syncs with the master -->
    <str name="commitReserveDuration">00:05:00</str>   <!-- how long a commit point stays reserved -->
  </lst>
</requestHandler>

The slave's configuration:

<requestHandler name="/replication" class="solr.ReplicationHandler" >
  <lst name="slave">
    <str name="masterUrl">http://localhost:8080/mysolr/collection1/replication</str>
    <!-- i.e. http://host_name:port/webapp_name/solrcore_name/replicationHandler_name/ -->
    <str name="pollInterval">00:05:00</str>  <!-- how often to check the master's index version and generation -->
    <str name="compression">internal</str>
    <str name="httpConnTimeout">50000000</str>
    <str name="httpReadTimeout">10000000</str>
    <!--
    <str name="httpBasicAuthUser">123</str>
    <str name="httpBasicAuthPassword">123</str>
    -->
    <!-- for safety, credentials can be required so that only authenticated requests may replicate -->
  </lst>
</requestHandler>
org.apache.solr.handler.RequestHandlerBase.java
It is the base class of all handlers and provides the most basic processing. Its important methods:
// Handles a Solr service request.
public void handleRequest(SolrQueryRequest req, SolrQueryResponse rsp) {
    numRequests++;
    try {
      SolrPluginUtils.setDefaults(req, defaults, appends, invariants);
      rsp.setHttpCaching(httpCaching);
      handleRequestBody(req, rsp);
      // count timeouts
      NamedList header = rsp.getResponseHeader();
      if (header != null) {
        Object partialResults = header.get("partialResults");
        boolean timedOut = partialResults == null ? false : (Boolean) partialResults;
        if (timedOut) {
          numTimeouts++;
          rsp.setHttpCaching(false);
        }
      }
    } catch (Exception e) {
      SolrException.log(SolrCore.log, e);
      if (e instanceof ParseException) {
        e = new SolrException(SolrException.ErrorCode.BAD_REQUEST, e);
      }
      rsp.setException(e);
      numErrors++;
    }
    totalTime += rsp.getEndTime() - req.getStartTime();
}

// This method is abstract: the real implementation lives in the subclasses that override it.
public abstract void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception;

(The ReplicationHandler override of handleRequestBody is quoted in full in the replication walkthrough below.)
The package that provides highlighting for retrieved text;
org.apache.solr.request
This package works mainly through the org.apache.solr.servlet.SolrRequestParser.java class, which parses the HttpServletRequest's parameters into request parameters that suit the Solr application.
The main method is:
public SolrQueryRequest parse(SolrCore core, String path, HttpServletRequest req) throws Exception {
    SolrRequestParser parser = standard;

    // TODO -- in the future, we could pick a different parser based on the request

    // Pick the parser from the request...
    ArrayList<ContentStream> streams = new ArrayList<ContentStream>(1);
    SolrParams params = parser.parseParamsAndFillStreams(req, streams);
    SolrQueryRequest sreq = buildRequestFrom(core, params, streams);

    // Handlers and login will want to know the path. If it contains a ':'
    // the handler could use it for RESTful URLs
    sreq.getContext().put("path", path);
    // A Solr request carries its own context: the request is passed as a parameter
    // to many methods, so a per-request context preserves important execution state.
    return sreq;
}
org.apache.solr.response
Encapsulates Solr's response information.
org.apache.solr.schema
Maps onto schema.xml and implements the features that file provides.
org.apache.solr.search org.apache.solr.servlet org.apache.solr.servlet.SolrDispatchFilter.java
This class is the key filter that switches on the Solr service; it generally must be configured in the web.xml deployment descriptor before Solr full-text search can be served. Its doFilter method is the entry point of every Solr service.
org.apache.solr.spelling org.apache.solr.update org.apache.solr.util
Find solr.xml under the Solr main directory and configure it, specifying the index store each SolrCore instance uses, its configuration, and the directory holding the dependency jars.
<solr persistent="false">
  <!-- adminPath: RequestHandler path to manage cores.
       The request handler mapped at this path manages the SolrCore instances in the container.
       If 'null' (or absent), cores will not be manageable via request handler.
       When one web application hosts several SolrCore instances, a search must name the
       core it targets; through
       http://host_name:port/webapp_name/solrcore_name/admin/cores/
       the container's SolrCore instances can be managed. -->
  <cores adminPath="/admin/cores" defaultCoreName="collection1">
    <core name="collection1" instanceDir="." />
    <core name="collection2" instanceDir="f:\\hh" />
  </cores>
</solr>
First find the configuration directory of the SolrCore instance that will take part in the distributed setup, then add the request handler that serves replication to its solrconfig.xml; some settings must also be specified in scripts.conf.
master
<requestHandler name="/replication" class="solr.ReplicationHandler" >
  <lst name="master">
    <str name="replicateAfter">commit</str>            <!-- replicate after each commit -->
    <str name="httpConnTimeout">50000000</str>         <!-- connection timeout for replication -->
    <str name="httpReadTimeout">10000000</str>         <!-- timeout for reading index files -->
    <str name="confFiles">schema.xml,stopwords.txt,elevate.xml</str>  <!-- config files distributed when a slave syncs with the master -->
    <str name="commitReserveDuration">00:05:00</str>   <!-- how long a commit point stays reserved -->
  </lst>
</requestHandler>
slave
<requestHandler name="/replication" class="solr.ReplicationHandler" >
  <lst name="slave">
    <str name="masterUrl">http://localhost:8080/mysolr/collection1/replication</str>
    <!-- i.e. http://host_name:port/webapp_name/solrcore_name/replicationHandler_name/ -->
    <str name="pollInterval">00:05:00</str>  <!-- how often to check the master's index version and generation -->
    <str name="compression">internal</str>
    <str name="httpConnTimeout">50000000</str>
    <str name="httpReadTimeout">10000000</str>
    <!--
    <str name="httpBasicAuthUser">123</str>
    <str name="httpBasicAuthPassword">123</str>
    -->
    <!-- for safety, credentials can be required so that only authenticated requests may replicate -->
  </lst>
</requestHandler>
http://host_name:port/webapp_name/solrcore_name/someSolrServiceParameters
Because the SolrDispatchFilter is configured, this request is certain to pass through the filter's doFilter method. After entering doFilter:
1. First, this filter, as an ordinary class, owns the CoreContainer property, which means every SolrCore instance can be reached from this class. With that in hand everything else follows. The first thing it does is save the CoreContainer instance into the request, i.e., into the HttpServletRequest's built-in attribute map, so that past this filter we can still reach many Solr-related things:
request.setAttribute("org.apache.solr.CoreContainer", cores);
2. The next step is to derive the corresponding Solr request object, a SolrQueryRequest, from the HttpServletRequest's parameters. The SolrQueryRequest instance is obtained through a parser whose parse method must be given:
public SolrQueryRequest parse( SolrCore core, String path, HttpServletRequest req )
3. A SolrQueryRequest is then returned; that is, the HttpServletRequest's parameters have been converted into an object through which the Solr container knows what this request needs. With the object describing the requested service constructed, the next step is to construct the request handler that will process it. That handler is already named within the SolrQueryRequest, so it only needs to be fetched from a map by its key:
handler = core.getRequestHandler( path );
Many such request handlers are already configured in the SolrCore instance; one is simply taken from among them. From the parameters, the object describing the wanted service has been built, and the handler aimed at that service is ready.
4. The next step is execution: this.execute( req, handler, solrReq, solrRsp );
This step is the entry point for handling every Solr service; if you want to know how Solr implements anything, everything starts here.
How master/slave index files are synchronized via HTTP requests
> master HTTP admin API:
enable replication:  http://master_name:port/solr-master/replication?command=enablereplication
disable replication: http://master_name:port/solr-master/replication?command=disablereplication
backup:              http://master_name:port/solr-master/replication?command=backup
> slave HTTP admin API:
replicate the index:          http://slave_host:port/solr-slave/replication?command=fetchindex
abort index replication:      http://slave_host:port/solr-slave/replication?command=abortfetch
enable polling replication:   http://slave_host:port/solr-slave/replication?command=enablepoll
disable polling replication:  http://slave_host:port/solr-slave/replication?command=disablepoll
replication details:          http://slave_host:port/solr-slave/replication?command=details
> public HTTP admin API:
get the index version: http://host_name:port/solr-slave/replication?command=indexversion
Key classes for distributed processing:
org.apache.solr.handler.ReplicationHandler.java
org.apache.solr.handler.SnapPuller.java
SnapPuller's inner class: SnapPuller$FileFetcher
Browser: send a GET request
http://slave_host:port/solr-slave/replication?command=fetchindex
The request passes through SolrDispatchFilter's doFilter method, which produces the SolrQueryRequest object and the SolrRequestHandler instance, then calls this.execute( req, handler, solrReq, solrRsp ); and enters ReplicationHandler's handleRequest. That method is inherited from the parent class, which in turn calls the subclass's handleRequestBody, where the replication-related processing begins:
@Override
public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {
    rsp.setHttpCaching(false);
    final SolrParams solrParams = req.getParams();
    String command = solrParams.get(COMMAND);
    if (command == null) {
      rsp.add(STATUS, OK_STATUS);
      rsp.add("message", "No command");
      return;
    }
    // This command does not give the current index version of the master
    // It gives the current 'replicateable' index version
    if (command.equals(CMD_INDEX_VERSION)) {
      IndexCommit commitPoint = indexCommitPoint;  // make a copy so it won't change
      if (commitPoint != null && replicationEnabled.get()) {
        //
        // There is a race condition here.  The commit point may be changed / deleted by the time
        // we get around to reserving it.  This is a very small window though, and should not result
        // in a catastrophic failure, but will result in the client getting an empty file list for
        // the CMD_GET_FILE_LIST command.
        //
        core.getDeletionPolicy().setReserveDuration(commitPoint.getVersion(), reserveCommitDuration);
        rsp.add(CMD_INDEX_VERSION, commitPoint.getVersion());
        rsp.add(GENERATION, commitPoint.getGeneration());
      } else {
        // This happens when replication is not configured to happen after startup and no commit/optimize
        // has happened yet.
        rsp.add(CMD_INDEX_VERSION, 0L);
        rsp.add(GENERATION, 0L);
      }
    } else if (command.equals(CMD_GET_FILE)) {
      getFileStream(solrParams, rsp);
    } else if (command.equals(CMD_GET_FILE_LIST)) {
      getFileList(solrParams, rsp);
    } else if (command.equalsIgnoreCase(CMD_BACKUP)) {
      doSnapShoot(new ModifiableSolrParams(solrParams), rsp, req);
      rsp.add(STATUS, OK_STATUS);
    } else if (command.equalsIgnoreCase(CMD_FETCH_INDEX)) {
      String masterUrl = solrParams.get(MASTER_URL);
      if (!isSlave && masterUrl == null) {
        rsp.add(STATUS, ERR_STATUS);
        rsp.add("message", "No slave configured or no 'masterUrl' Specified");
        return;
      }
      final SolrParams paramsCopy = new ModifiableSolrParams(solrParams);
      new Thread() {
        @Override
        public void run() {
          doFetch(paramsCopy);  /* the key processing happens here */
        }
      }.start();
      rsp.add(STATUS, OK_STATUS);
    } else if (command.equalsIgnoreCase(CMD_DISABLE_POLL)) {
      if (snapPuller != null) {
        snapPuller.disablePoll();
        rsp.add(STATUS, OK_STATUS);
      } else {
        rsp.add(STATUS, ERR_STATUS);
        rsp.add("message", "No slave configured");
      }
    } else if (command.equalsIgnoreCase(CMD_ENABLE_POLL)) {
      if (snapPuller != null) {
        snapPuller.enablePoll();
        rsp.add(STATUS, OK_STATUS);
      } else {
        rsp.add(STATUS, ERR_STATUS);
        rsp.add("message", "No slave configured");
      }
    } else if (command.equalsIgnoreCase(CMD_ABORT_FETCH)) {
      if (snapPuller != null) {
        snapPuller.abortPull();
        rsp.add(STATUS, OK_STATUS);
      } else {
        rsp.add(STATUS, ERR_STATUS);
        rsp.add("message", "No slave configured");
      }
    } else if (command.equals(CMD_FILE_CHECKSUM)) {
      // this command is not used by anyone
      getFileChecksum(solrParams, rsp);
    } else if (command.equals(CMD_SHOW_COMMITS)) {
      rsp.add(CMD_SHOW_COMMITS, getCommits());
    } else if (command.equals(CMD_DETAILS)) {
      rsp.add(CMD_DETAILS, getReplicationDetails(solrParams.getBool("slave", true)));
      RequestHandlerUtils.addExperimentalFormatWarning(rsp);
    } else if (CMD_ENABLE_REPL.equalsIgnoreCase(command)) {
      replicationEnabled.set(true);
      rsp.add(STATUS, OK_STATUS);
    } else if (CMD_DISABLE_REPL.equalsIgnoreCase(command)) {
      replicationEnabled.set(false);
      rsp.add(STATUS, OK_STATUS);
    }
}
Subclasses of SolrRequest
Used to control the CoreContainer in a web application (querying, adding instances, adding documents, and so on); each has a corresponding SolrResponse.
$CATALINA_HOME\webapps\solr\solr\collection1\conf\schema.xml
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="example" version="1.5">
  <!-- Valid field attributes:
       name: gives the field a name (mandatory)
       type: a field type chosen from the types defined in the earlier section
       indexed: whether the field should be indexed; once indexed it can be queried and sorted on
       stored: true means the field's value is stored and retrievable
       multiValued: one field may carry several values in a document, all indexed;
                    when parsed out it looks like:
                    <country>
                      <arr>
                        <str>China</str>
                        <str>USA</str>
                        <str>Germany</str>
                      </arr>
                    </country>
       omitNorms: false
       termVectors: [false]
       termOffsets:
       default: a default value assigned when the field is given none
  -->
  <fields>
    <field name="_version_" type="long" indexed="true" stored="true"/>
    <field name="_root_" type="string" indexed="true" stored="false"/>

    <!-- Only remove the "id" field if you have a very good reason to. While not strictly
      required, it is highly recommended. A <uniqueKey> is present in almost all Solr
      installations. See the <uniqueKey> declaration below where <uniqueKey> is set to "id".
    -->
    <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
    <field name="sku" type="text_en_splitting_tight" indexed="true" stored="true" omitNorms="true"/>
    <field name="name" type="text_general" indexed="true" stored="true"/>
    <field name="manu" type="text_general" indexed="true" stored="true" omitNorms="true"/>
    <field name="cat" type="string" indexed="true" stored="true" multiValued="true"/>
    <field name="features" type="text_general" indexed="true" stored="true" multiValued="true"/>
    <!-- multiValued="true" marks a composite field: the field is assembled from several other fields via copyField -->
    <field name="includes" type="text_general" indexed="true" stored="true" termVectors="true" termPositions="true" termOffsets="true" />

    <field name="weight" type="float" indexed="true" stored="true"/>
    <field name="price" type="float" indexed="true" stored="true"/>
    <field name="popularity" type="int" indexed="true" stored="true" />
    <field name="inStock" type="boolean" indexed="true" stored="true" />

    <field name="store" type="location" indexed="true" stored="true"/>

    <!-- Common metadata fields, named specifically to match up with
      SolrCell metadata when parsing rich documents such as Word, PDF.
      Some fields are multiValued only because Tika currently may return
      multiple values for them.
Some metadata is parsed from the documents, but there are some which come from the client context: "content_type": From the HTTP headers of incoming stream "resourcename": From SolrCell request param resource.name --> <field name="title" type="text_general" indexed="true" stored="true" multiValued="true"/> <field name="subject" type="text_general" indexed="true" stored="true"/> <field name="description" type="text_general" indexed="true" stored="true"/> <field name="comments" type="text_general" indexed="true" stored="true"/> <field name="author" type="text_general" indexed="true" stored="true"/> <field name="keywords" type="text_general" indexed="true" stored="true"/> <field name="category" type="text_general" indexed="true" stored="true"/> <field name="resourcename" type="text_general" indexed="true" stored="true"/> <field name="url" type="text_general" indexed="true" stored="true"/> <field name="content_type" type="string" indexed="true" stored="true" multiValued="true"/> <field name="last_modified" type="date" indexed="true" stored="true"/> <field name="links" type="string" indexed="true" stored="true" multiValued="true"/> <!-- Main body of document extracted by SolrCell. NOTE: This field is not indexed by default, since it is also copied to "text" using copyField below. This is to save space. Use this field for returning and highlighting document content. Use the "text" field to search the content. --> <field name="content" type="text_general" indexed="false" stored="true" multiValued="true"/> <!-- catchall field, containing all other searchable text fields (implemented via copyField further on in this schema --> <field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/> <!-- catchall text field that indexes tokens both normally and in reverse for efficient leading wildcard queries. --> <field name="text_rev" type="text_general_rev" indexed="true" stored="false" multiValued="true"/> <!-- non-tokenized version of manufacturer to make it easier to sort or group results by manufacturer. copied from "manu" via copyField --> <field name="manu_exact" type="string" indexed="true" stored="false"/> <field name="payloads" type="payloads" indexed="true" stored="true"/> <!-- Some fields such as popularity and manu_exact could be modified to leverage doc values: <field name="popularity" type="int" indexed="true" stored="true" docValues="true" /> <field name="manu_exact" type="string" indexed="false" stored="false" docValues="true" /> <field name="cat" type="string" indexed="true" stored="true" docValues="true" multiValued="true"/> Although it would make indexing slightly slower and the index bigger, it would also make the index faster to load, more memory-efficient and more NRT-friendly. --> <!-- Dynamic field definitions allow using convention over configuration for fields via the specification of patterns to match field names. EXAMPLE: name="*_i" will match any field ending in _i (like myid_i, z_i) RESTRICTION: the glob-like pattern in the name attribute must have a "*" only at the start or the end. 
    -->
    <!-- Dynamic fields are defined with a * wildcard, e.g.
         <dynamicField name="*_ti" type="tint" indexed="true" stored="true"/> -->
    <dynamicField name="*_i" type="int" indexed="true" stored="true"/>
    <dynamicField name="*_is" type="int" indexed="true" stored="true" multiValued="true"/>
    <dynamicField name="*_s" type="string" indexed="true" stored="true" />
    <dynamicField name="*_ss" type="string" indexed="true" stored="true" multiValued="true"/>
    <dynamicField name="*_l" type="long" indexed="true" stored="true"/>
    <dynamicField name="*_ls" type="long" indexed="true" stored="true" multiValued="true"/>
    <dynamicField name="*_t" type="text_general" indexed="true" stored="true"/>
    <dynamicField name="*_txt" type="text_general" indexed="true" stored="true" multiValued="true"/>
    <dynamicField name="*_en" type="text_en" indexed="true" stored="true" multiValued="true"/>
    <dynamicField name="*_b" type="boolean" indexed="true" stored="true"/>
    <dynamicField name="*_bs" type="boolean" indexed="true" stored="true" multiValued="true"/>
    <dynamicField name="*_f" type="float" indexed="true" stored="true"/>
    <dynamicField name="*_fs" type="float" indexed="true" stored="true" multiValued="true"/>
    <dynamicField name="*_d" type="double" indexed="true" stored="true"/>
    <dynamicField name="*_ds" type="double" indexed="true" stored="true" multiValued="true"/>

    <!-- Type used to index the lat and lon components for the "location" FieldType -->
    <dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false" />

    <dynamicField name="*_dt" type="date" indexed="true" stored="true"/>
    <dynamicField name="*_dts" type="date" indexed="true" stored="true" multiValued="true"/>
    <dynamicField name="*_p" type="location" indexed="true" stored="true"/>

    <!-- some trie-coded dynamic fields for faster range queries -->
    <dynamicField name="*_ti" type="tint" indexed="true" stored="true"/>
    <dynamicField name="*_tl" type="tlong" indexed="true" stored="true"/>
    <dynamicField name="*_tf" type="tfloat" indexed="true" stored="true"/>
    <dynamicField name="*_td" type="tdouble" indexed="true" stored="true"/>
    <dynamicField name="*_tdt" type="tdate" indexed="true" stored="true"/>

    <dynamicField name="*_pi" type="pint" indexed="true" stored="true"/>
    <dynamicField name="*_c" type="currency" indexed="true" stored="true"/>

    <dynamicField name="ignored_*" type="ignored" multiValued="true"/>
    <dynamicField name="attr_*" type="text_general" indexed="true" stored="true" multiValued="true"/>

    <dynamicField name="random_*" type="random" />

    <!-- uncomment the following to ignore any fields that don't already match an existing
         field name or dynamic field, rather than reporting them as an error.
         alternately, change the type="ignored" to some other type e.g. "text" if you want
         unknown fields indexed and/or stored by default -->
    <!--dynamicField name="*" type="ignored" multiValued="true" /-->
  </fields>

  <!-- Field to use to determine and enforce document uniqueness.
       Unless this field is marked with required="false", it will be a required field -->
  <uniqueKey>id</uniqueKey>   <!-- think of it as the table's primary key -->

  <!-- DEPRECATED: The defaultSearchField is consulted by various query parsers when
    parsing a query string that isn't explicit about the field.  Machine (non-user)
    generated queries are best made explicit, or they can use the "df" request parameter
    which takes precedence over this.
    Note: Un-commenting defaultSearchField will be insufficient if your request handler
    in solrconfig.xml defines "df", which takes precedence. That would need to be removed.
    <defaultSearchField>text</defaultSearchField> -->

  <!-- DEPRECATED: The defaultOperator (AND|OR) is consulted by various query parsers
    when parsing a query string to determine if a clause of the query should be marked as
    required or optional, assuming the clause isn't already marked by some operator.
    The default is OR, which is generally assumed so it is not a good idea to change it
    globally here.  The "q.op" request parameter takes precedence over this.
    <solrQueryParser defaultOperator="OR"/> -->
  <!-- defaultOperator="AND|OR" specifies the default operator joining query clauses for
       the query parser; the system default is <solrQueryParser defaultOperator="OR"/>.
       Best left unchanged! -->

  <!-- copyField commands copy one field to another at the time a document
       is added to the index.  It's used either to index the same field differently,
       or to add multiple fields to the same field for easier/faster searching.
       e.g. <copyField source="cat" dest="text"/> indexes the cat field's text together
       into the destination field text when a document is added;
       <copyField source="*_t" dest="text" maxChars="3000"/> -->

  <copyField source="cat" dest="text"/>
  <copyField source="name" dest="text"/>
  <copyField source="manu" dest="text"/>
  <copyField source="features" dest="text"/>
  <copyField source="includes" dest="text"/>
  <copyField source="manu" dest="manu_exact"/>

  <!-- Copy the price into a currency enabled field (default USD) -->
  <copyField source="price" dest="price_c"/>

  <!-- Text fields from SolrCell to search by default in our catch-all field -->
  <copyField source="title" dest="text"/>
  <copyField source="author" dest="text"/>
  <copyField source="description" dest="text"/>
  <copyField source="keywords" dest="text"/>
  <copyField source="content" dest="text"/>
  <copyField source="content_type" dest="text"/>
  <copyField source="resourcename" dest="text"/>
  <copyField source="url" dest="text"/>

  <!-- Create a string version of author for faceting -->
  <copyField source="author" dest="author_s"/>

  <!-- Above, multiple source fields are copied to the [text] field.
    Another way to map multiple source fields to the same destination field
    is to use the dynamic field syntax.
    copyField also supports a maxChars to copy setting.  -->
  <!-- <copyField source="*_t" dest="text" maxChars="3000"/> -->

  <!-- copy name to alphaNameSort, a field designed for sorting by name -->
  <!-- <copyField source="name" dest="alphaNameSort"/> -->

  <types>
    <!-- field type definitions. The "name" attribute is
       just a label to be used by field definitions.  The "class"
       attribute and any other attributes determine the real
       behavior of the fieldType.
         Class names starting with "solr" refer to java classes in a
       standard package such as org.apache.solr.analysis
    -->

    <!-- The StrField type is not analyzed, but indexed/stored verbatim.
       It supports doc values but in that case the field needs to be
       single-valued and either required or have a default value.
      -->
    <fieldType name="string" class="solr.StrField" sortMissingLast="true" />

    <!-- boolean type: "true" or "false" -->
    <fieldType name="boolean" class="solr.BoolField" sortMissingLast="true"/>

    <!-- sortMissingLast and sortMissingFirst attributes are optional attributes are
         currently supported on types that are sorted internally as strings
         and on numeric types.
         This includes "string","boolean", and, as of 3.5 (and 4.x),
         int, float, long, date, double, including the "Trie" variants.
       - If sortMissingLast="true", then a sort on this field will cause documents
         without the field to come after documents with the field,
         regardless of the requested sort order (asc or desc).
- If sortMissingFirst="true", then a sort on this field will cause documents without the field to come before documents with the field, regardless of the requested sort order. - If sortMissingLast="false" and sortMissingFirst="false" (the default), then default lucene sorting will be used which places docs without the field first in an ascending sort and last in a descending sort. --> <!-- Default numeric field types. For faster range queries, consider the tint/tfloat/tlong/tdouble types. These fields support doc values, but they require the field to be single-valued and either be required or have a default value. --> <fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/> <fieldType name="float" class="solr.TrieFloatField" precisionStep="0" positionIncrementGap="0"/> <fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/> <fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" positionIncrementGap="0"/> <!-- Numeric field types that index each value at various levels of precision to accelerate range queries when the number of values between the range endpoints is large. See the javadoc for NumericRangeQuery for internal implementation details. Smaller precisionStep values (specified in bits) will lead to more tokens indexed per value, slightly larger index size, and faster range queries. A precisionStep of 0 disables indexing at different precision levels. --> <fieldType name="tint" class="solr.TrieIntField" precisionStep="8" positionIncrementGap="0"/> <fieldType name="tfloat" class="solr.TrieFloatField" precisionStep="8" positionIncrementGap="0"/> <fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" positionIncrementGap="0"/> <fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8" positionIncrementGap="0"/> <!-- The format for this date field is of the form 1995-12-31T23:59:59Z, and is a more restricted form of the canonical representation of dateTime http://www.w3.org/TR/xmlschema-2/#dateTime The trailing "Z" designates UTC time and is mandatory. Optional fractional seconds are allowed: 1995-12-31T23:59:59.999Z All other components are mandatory. Expressions can also be used to denote calculations that should be performed relative to "NOW" to determine the value, ie... NOW/HOUR ... Round to the start of the current hour NOW-1DAY ... Exactly 1 day prior to now NOW/DAY+6MONTHS+3DAYS ... 6 months and 3 days in the future from the start of the current day Consult the DateField javadocs for more information. Note: For faster range queries, consider the tdate type --> <fieldType name="date" class="solr.TrieDateField" precisionStep="0" positionIncrementGap="0"/> <!-- A Trie based date field for faster date range queries and date faceting. --> <fieldType name="tdate" class="solr.TrieDateField" precisionStep="6" positionIncrementGap="0"/> <!--Binary data type. The data should be sent/retrieved in as Base64 encoded Strings --> <fieldtype name="binary" class="solr.BinaryField"/> <!-- Note: These should only be used for compatibility with existing indexes (created with lucene or older Solr versions). Use Trie based fields instead. 
As of Solr 3.5 and 4.x, Trie based fields support sortMissingFirst/Last Plain numeric field types that store and index the text value verbatim (and hence don't correctly support range queries, since the lexicographic ordering isn't equal to the numeric ordering) --> <fieldType name="pint" class="solr.IntField"/> <fieldType name="plong" class="solr.LongField"/> <fieldType name="pfloat" class="solr.FloatField"/> <fieldType name="pdouble" class="solr.DoubleField"/> <fieldType name="pdate" class="solr.DateField" sortMissingLast="true"/> <!-- The "RandomSortField" is not used to store or search any data. You can declare fields of this type it in your schema to generate pseudo-random orderings of your docs for sorting or function purposes. The ordering is generated based on the field name and the version of the index. As long as the index version remains unchanged, and the same field name is reused, the ordering of the docs will be consistent. If you want different psuedo-random orderings of documents, for the same version of the index, use a dynamicField and change the field name in the request. --> <fieldType name="random" class="solr.RandomSortField" indexed="true" /> <!-- solr.TextField allows the specification of custom text analyzers specified as a tokenizer and a list of token filters. Different analyzers may be specified for indexing and querying. The optional positionIncrementGap puts space between multiple fields of this type on the same document, with the purpose of preventing false phrase matching across fields. For more info on customizing your analyzer chain, please see http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters --> <!-- One can also specify an existing Analyzer class that has a default constructor via the class attribute on the analyzer element. Example: <fieldType name="text_greek" class="solr.TextField"> <analyzer class="org.apache.lucene.analysis.el.GreekAnalyzer"/> </fieldType> --> <!-- A text field that only splits on whitespace for exact matching of words --> <fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.WhitespaceTokenizerFactory"/> </analyzer> </fieldType> <!-- A general text field that has reasonable, generic cross-language defaults: it tokenizes with StandardTokenizer, removes stop words from case-insensitive "stopwords.txt" (empty by default), and down cases. At query time only, it also applies synonyms. --> <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" /> <!-- in this example, we will only use synonyms at query time <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/> --> <filter class="solr.LowerCaseFilterFactory"/> </analyzer> <analyzer type="query"> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" /> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/> <filter class="solr.LowerCaseFilterFactory"/> </analyzer> </fieldType> <!-- A text field with defaults appropriate for English: it tokenizes with StandardTokenizer, removes English stop words (lang/stopwords_en.txt), down cases, protects words from protwords.txt, and finally applies Porter's stemming. The query time analyzer also applies synonyms from synonyms.txt. 
--> <fieldType name="text_en" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="solr.StandardTokenizerFactory"/> <!-- in this example, we will only use synonyms at query time <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/> --> <!-- Case insensitive stop word removal. --> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt" /> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.EnglishPossessiveFilterFactory"/> <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/> <!-- Optionally you may want to use this less aggressive stemmer instead of PorterStemFilterFactory: <filter class="solr.EnglishMinimalStemFilterFactory"/> --> <filter class="solr.PorterStemFilterFactory"/> </analyzer> <analyzer type="query"> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt" /> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.EnglishPossessiveFilterFactory"/> <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/> <!-- Optionally you may want to use this less aggressive stemmer instead of PorterStemFilterFactory: <filter class="solr.EnglishMinimalStemFilterFactory"/> --> <filter class="solr.PorterStemFilterFactory"/> </analyzer> </fieldType> <!-- A text field with defaults appropriate for English, plus aggressive word-splitting and autophrase features enabled. This field is just like text_en, except it adds WordDelimiterFilter to enable splitting and matching of words on case-change, alpha numeric boundaries, and non-alphanumeric chars. This means certain compound word cases will work, for example query "wi fi" will match document "WiFi" or "wi-fi". --> <fieldType name="text_en_splitting" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true"> <analyzer type="index"> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <!-- in this example, we will only use synonyms at query time <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/> --> <!-- Case insensitive stop word removal. --> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt" /> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/> <filter class="solr.PorterStemFilterFactory"/> </analyzer> <analyzer type="query"> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt" /> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/> <filter class="solr.PorterStemFilterFactory"/> </analyzer> </fieldType> <!-- Less flexible matching, but less false matches. 
Probably not ideal for product names, but may be good for SKUs. Can insert dashes in the wrong place and still match. --> <fieldType name="text_en_splitting_tight" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true"> <analyzer> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="1" catenateNumbers="1" catenateAll="0"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/> <filter class="solr.EnglishMinimalStemFilterFactory"/> <!-- this filter can remove any duplicate tokens that appear at the same position - sometimes possible with WordDelimiterFilter in conjunction with stemming. --> <filter class="solr.RemoveDuplicatesTokenFilterFactory"/> </analyzer> </fieldType> <!-- Just like text_general except it reverses the characters of each token, to enable more efficient leading wildcard queries. --> <fieldType name="text_general_rev" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" /> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true" maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/> </analyzer> <analyzer type="query"> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" /> <filter class="solr.LowerCaseFilterFactory"/> </analyzer> </fieldType> <!-- charFilter + WhitespaceTokenizer --> <!-- <fieldType name="text_char_norm" class="solr.TextField" positionIncrementGap="100" > <analyzer> <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/> <tokenizer class="solr.WhitespaceTokenizerFactory"/> </analyzer> </fieldType> --> <!-- This is an example of using the KeywordTokenizer along with various TokenFilterFactories to produce a sortable field that does not include some properties of the source text --> <fieldType name="alphaOnlySort" class="solr.TextField" sortMissingLast="true" omitNorms="true"> <analyzer> <!-- KeywordTokenizer does no actual tokenizing, so the entire input string is preserved as a single token --> <tokenizer class="solr.KeywordTokenizerFactory"/> <!-- The LowerCase TokenFilter does what you expect, which can be useful when you want your sorting to be case insensitive --> <filter class="solr.LowerCaseFilterFactory" /> <!-- The TrimFilter removes any leading or trailing whitespace --> <filter class="solr.TrimFilterFactory" /> <!-- The PatternReplaceFilter gives you the flexibility to use Java Regular expression to replace any sequence of characters matching a pattern with an arbitrary replacement string, which may include back references to portions of the original string matched by the pattern. See the Java Regular Expression documentation for more information on pattern and replacement string syntax.
http://java.sun.com/j2se/1.6.0/docs/api/java/util/regex/package-summary.html --> <filter class="solr.PatternReplaceFilterFactory" pattern="([^a-z])" replacement="" replace="all" /> </analyzer> </fieldType> <fieldtype name="phonetic" stored="false" indexed="true" class="solr.TextField" > <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.DoubleMetaphoneFilterFactory" inject="false"/> </analyzer> </fieldtype> <fieldtype name="payloads" stored="false" indexed="true" class="solr.TextField" > <analyzer> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <!-- The DelimitedPayloadTokenFilter can put payloads on tokens... for example, a token of "foo|1.4" would be indexed as "foo" with a payload of 1.4f Attributes of the DelimitedPayloadTokenFilterFactory : "delimiter" - a one character delimiter. Default is | (pipe) "encoder" - how to encode the following value into a payload float -> org.apache.lucene.analysis.payloads.FloatEncoder, integer -> o.a.l.a.p.IntegerEncoder identity -> o.a.l.a.p.IdentityEncoder Fully Qualified class name implementing PayloadEncoder, Encoder must have a no arg constructor. --> <filter class="solr.DelimitedPayloadTokenFilterFactory" encoder="float"/> </analyzer> </fieldtype> <!-- lowercases the entire field value, keeping it as a single token. --> <fieldType name="lowercase" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.KeywordTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory" /> </analyzer> </fieldType> <!-- Example of using PathHierarchyTokenizerFactory at index time, so queries for paths match documents at that path, or in descendent paths --> <fieldType name="descendent_path" class="solr.TextField"> <analyzer type="index"> <tokenizer class="solr.PathHierarchyTokenizerFactory" delimiter="/" /> </analyzer> <analyzer type="query"> <tokenizer class="solr.KeywordTokenizerFactory" /> </analyzer> </fieldType> <!-- Example of using PathHierarchyTokenizerFactory at query time, so queries for paths match documents at that path, or in ancestor paths --> <fieldType name="ancestor_path" class="solr.TextField"> <analyzer type="index"> <tokenizer class="solr.KeywordTokenizerFactory" /> </analyzer> <analyzer type="query"> <tokenizer class="solr.PathHierarchyTokenizerFactory" delimiter="/" /> </analyzer> </fieldType> <!-- since fields of this type are by default not stored or indexed, any data added to them will be ignored outright. --> <fieldtype name="ignored" stored="false" indexed="false" multiValued="true" class="solr.StrField" /> <!-- This point type indexes the coordinates as separate fields (subFields) If subFieldType is defined, it references a type, and a dynamic field definition is created matching *___<typename>. Alternately, if subFieldSuffix is defined, that is used to create the subFields. Example: if subFieldType="double", then the coordinates would be indexed in fields myloc_0___double,myloc_1___double. Example: if subFieldSuffix="_d" then the coordinates would be indexed in fields myloc_0_d,myloc_1_d The subFields are an implementation detail of the fieldType, and end users normally should not need to know about them. --> <fieldType name="point" class="solr.PointType" dimension="2" subFieldSuffix="_d"/> <!-- A specialized field for geospatial search. If indexed, this fieldType must not be multivalued. --> <fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/> <!-- An alternative geospatial field type new to Solr 4.
It supports multiValued and polygon shapes. For more information about this and other Spatial fields new to Solr 4, see: http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4 --> <fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType" geo="true" distErrPct="0.025" maxDistErr="0.000009" units="degrees" /> <!-- Money/currency field type. See http://wiki.apache.org/solr/MoneyFieldType Parameters: defaultCurrency: Specifies the default currency if none specified. Defaults to "USD" precisionStep: Specifies the precisionStep for the TrieLong field used for the amount providerClass: Lets you plug in other exchange provider backend: solr.FileExchangeRateProvider is the default and takes one parameter: currencyConfig: name of an xml file holding exchange rates solr.OpenExchangeRatesOrgProvider uses rates from openexchangerates.org: ratesFileLocation: URL or path to rates JSON file (default latest.json on the web) refreshInterval: Number of minutes between each rates fetch (default: 1440, min: 60) --> <fieldType name="currency" class="solr.CurrencyField" precisionStep="8" defaultCurrency="USD" currencyConfig="currency.xml" /> <!-- some examples for different languages (generally ordered by ISO code) --> <!-- Arabic --> <fieldType name="text_ar" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <!-- for any non-arabic --> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ar.txt" /> <!-- normalizes ﻯ to ﻱ, etc --> <filter class="solr.ArabicNormalizationFilterFactory"/> <filter class="solr.ArabicStemFilterFactory"/> </analyzer> </fieldType> <!-- Bulgarian --> <fieldType name="text_bg" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_bg.txt" /> <filter class="solr.BulgarianStemFilterFactory"/> </analyzer> </fieldType> <!-- Catalan --> <fieldType name="text_ca" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <!-- removes l', etc --> <filter class="solr.ElisionFilterFactory" ignoreCase="true" articles="lang/contractions_ca.txt"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ca.txt" /> <filter class="solr.SnowballPorterFilterFactory" language="Catalan"/> </analyzer> </fieldType> <!-- CJK bigram (see text_ja for a Japanese configuration using morphological analysis) --> <fieldType name="text_cjk" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <!-- normalize width before bigram, as e.g. 
half-width dakuten combine --> <filter class="solr.CJKWidthFilterFactory"/> <!-- for any non-CJK --> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.CJKBigramFilterFactory"/> </analyzer> </fieldType> <!-- Kurdish --> <fieldType name="text_ckb" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.SoraniNormalizationFilterFactory"/> <!-- for any latin text --> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ckb.txt"/> <filter class="solr.SoraniStemFilterFactory"/> </analyzer> </fieldType> <!-- Czech --> <fieldType name="text_cz" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_cz.txt" /> <filter class="solr.CzechStemFilterFactory"/> </analyzer> </fieldType> <!-- Danish --> <fieldType name="text_da" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_da.txt" format="snowball" /> <filter class="solr.SnowballPorterFilterFactory" language="Danish"/> </analyzer> </fieldType> <!-- German --> <fieldType name="text_de" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt" format="snowball" /> <filter class="solr.GermanNormalizationFilterFactory"/> <filter class="solr.GermanLightStemFilterFactory"/> <!-- less aggressive: <filter class="solr.GermanMinimalStemFilterFactory"/> --> <!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="German2"/> --> </analyzer> </fieldType> <!-- Greek --> <fieldType name="text_el" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <!-- greek specific lowercase for sigma --> <filter class="solr.GreekLowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="false" words="lang/stopwords_el.txt" /> <filter class="solr.GreekStemFilterFactory"/> </analyzer> </fieldType> <!-- Spanish --> <fieldType name="text_es" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_es.txt" format="snowball" /> <filter class="solr.SpanishLightStemFilterFactory"/> <!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/> --> </analyzer> </fieldType> <!-- Basque --> <fieldType name="text_eu" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_eu.txt" /> <filter class="solr.SnowballPorterFilterFactory" language="Basque"/> </analyzer> </fieldType> <!-- Persian --> <fieldType name="text_fa" class="solr.TextField" positionIncrementGap="100"> <analyzer> <!-- for ZWNJ --> <charFilter class="solr.PersianCharFilterFactory"/> <tokenizer class="solr.StandardTokenizerFactory"/> 
<filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.ArabicNormalizationFilterFactory"/> <filter class="solr.PersianNormalizationFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_fa.txt" /> </analyzer> </fieldType> <!-- Finnish --> <fieldType name="text_fi" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_fi.txt" format="snowball" /> <filter class="solr.SnowballPorterFilterFactory" language="Finnish"/> <!-- less aggressive: <filter class="solr.FinnishLightStemFilterFactory"/> --> </analyzer> </fieldType> <!-- French --> <fieldType name="text_fr" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <!-- removes l', etc --> <filter class="solr.ElisionFilterFactory" ignoreCase="true" articles="lang/contractions_fr.txt"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_fr.txt" format="snowball" /> <filter class="solr.FrenchLightStemFilterFactory"/> <!-- less aggressive: <filter class="solr.FrenchMinimalStemFilterFactory"/> --> <!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="French"/> --> </analyzer> </fieldType> <!-- Irish --> <fieldType name="text_ga" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <!-- removes d', etc --> <filter class="solr.ElisionFilterFactory" ignoreCase="true" articles="lang/contractions_ga.txt"/> <!-- removes n-, etc. position increments is intentionally false! 
--> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/hyphenations_ga.txt"/> <filter class="solr.IrishLowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ga.txt"/> <filter class="solr.SnowballPorterFilterFactory" language="Irish"/> </analyzer> </fieldType> <!-- Galician --> <fieldType name="text_gl" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_gl.txt" /> <filter class="solr.GalicianStemFilterFactory"/> <!-- less aggressive: <filter class="solr.GalicianMinimalStemFilterFactory"/> --> </analyzer> </fieldType> <!-- Hindi --> <fieldType name="text_hi" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <!-- normalizes unicode representation --> <filter class="solr.IndicNormalizationFilterFactory"/> <!-- normalizes variation in spelling --> <filter class="solr.HindiNormalizationFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_hi.txt" /> <filter class="solr.HindiStemFilterFactory"/> </analyzer> </fieldType> <!-- Hungarian --> <fieldType name="text_hu" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_hu.txt" format="snowball" /> <filter class="solr.SnowballPorterFilterFactory" language="Hungarian"/> <!-- less aggressive: <filter class="solr.HungarianLightStemFilterFactory"/> --> </analyzer> </fieldType> <!-- Armenian --> <fieldType name="text_hy" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_hy.txt" /> <filter class="solr.SnowballPorterFilterFactory" language="Armenian"/> </analyzer> </fieldType> <!-- Indonesian --> <fieldType name="text_id" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_id.txt" /> <!-- for a less aggressive approach (only inflectional suffixes), set stemDerivational to false --> <filter class="solr.IndonesianStemFilterFactory" stemDerivational="true"/> </analyzer> </fieldType> <!-- Italian --> <fieldType name="text_it" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <!-- removes l', etc --> <filter class="solr.ElisionFilterFactory" ignoreCase="true" articles="lang/contractions_it.txt"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_it.txt" format="snowball" /> <filter class="solr.ItalianLightStemFilterFactory"/> <!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="Italian"/> --> </analyzer> </fieldType> <!-- Japanese using morphological analysis (see text_cjk for a configuration using bigramming) NOTE: If you want to optimize search for precision, use default operator AND in your query parser config with <solrQueryParser 
defaultOperator="AND"/> further down in this file. Use OR if you would like to optimize for recall (default). --> <fieldType name="text_ja" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="false"> <analyzer> <!-- Kuromoji Japanese morphological analyzer/tokenizer (JapaneseTokenizer) Kuromoji has a search mode (default) that does segmentation useful for search. A heuristic is used to segment compounds into its parts and the compound itself is kept as synonym. Valid values for attribute mode are: normal: regular segmentation search: segmentation useful for search with synonyms compounds (default) extended: same as search mode, but unigrams unknown words (experimental) For some applications it might be good to use search mode for indexing and normal mode for queries to reduce recall and prevent parts of compounds from being matched and highlighted. Use <analyzer type="index"> and <analyzer type="query"> for this and mode normal in query. Kuromoji also has a convenient user dictionary feature that allows overriding the statistical model with your own entries for segmentation, part-of-speech tags and readings without a need to specify weights. Notice that user dictionaries have not been subject to extensive testing. User dictionary attributes are: userDictionary: user dictionary filename userDictionaryEncoding: user dictionary encoding (default is UTF-8) See lang/userdict_ja.txt for a sample user dictionary file. Punctuation characters are discarded by default. Use discardPunctuation="false" to keep them. See http://wiki.apache.org/solr/JapaneseLanguageSupport for more on Japanese language support. --> <tokenizer class="solr.JapaneseTokenizerFactory" mode="search"/> <!--<tokenizer class="solr.JapaneseTokenizerFactory" mode="search" userDictionary="lang/userdict_ja.txt"/>--> <!-- Reduces inflected verbs and adjectives to their base/dictionary forms (辞書形) --> <filter class="solr.JapaneseBaseFormFilterFactory"/> <!-- Removes tokens with certain part-of-speech tags --> <filter class="solr.JapanesePartOfSpeechStopFilterFactory" tags="lang/stoptags_ja.txt" /> <!-- Normalizes full-width romaji to half-width and half-width kana to full-width (Unicode NFKC subset) --> <filter class="solr.CJKWidthFilterFactory"/> <!-- Removes common tokens typically not useful for search, but have a negative effect on ranking --> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ja.txt" /> <!-- Normalizes common katakana spelling variations by removing any last long sound character (U+30FC) --> <filter class="solr.JapaneseKatakanaStemFilterFactory" minimumLength="4"/> <!-- Lower-cases romaji characters --> <filter class="solr.LowerCaseFilterFactory"/> </analyzer> </fieldType> <!-- Latvian --> <fieldType name="text_lv" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_lv.txt" /> <filter class="solr.LatvianStemFilterFactory"/> </analyzer> </fieldType> <!-- Dutch --> <fieldType name="text_nl" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_nl.txt" format="snowball" /> <filter class="solr.StemmerOverrideFilterFactory" dictionary="lang/stemdict_nl.txt" ignoreCase="false"/> <filter 
class="solr.SnowballPorterFilterFactory" language="Dutch"/> </analyzer> </fieldType> <!-- Norwegian --> <fieldType name="text_no" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_no.txt" format="snowball" /> <filter class="solr.SnowballPorterFilterFactory" language="Norwegian"/> <!-- less aggressive: <filter class="solr.NorwegianLightStemFilterFactory" variant="nb"/> --> <!-- singular/plural: <filter class="solr.NorwegianMinimalStemFilterFactory" variant="nb"/> --> <!-- The "light" and "minimal" stemmers support variants: nb=Bokmål, nn=Nynorsk, no=Both --> </analyzer> </fieldType> <!-- Portuguese --> <fieldType name="text_pt" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_pt.txt" format="snowball" /> <filter class="solr.PortugueseLightStemFilterFactory"/> <!-- less aggressive: <filter class="solr.PortugueseMinimalStemFilterFactory"/> --> <!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="Portuguese"/> --> <!-- most aggressive: <filter class="solr.PortugueseStemFilterFactory"/> --> </analyzer> </fieldType> <!-- Romanian --> <fieldType name="text_ro" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ro.txt" /> <filter class="solr.SnowballPorterFilterFactory" language="Romanian"/> </analyzer> </fieldType> <!-- Russian --> <fieldType name="text_ru" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ru.txt" format="snowball" /> <filter class="solr.SnowballPorterFilterFactory" language="Russian"/> <!-- less aggressive: <filter class="solr.RussianLightStemFilterFactory"/> --> </analyzer> </fieldType> <!-- Swedish --> <fieldType name="text_sv" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_sv.txt" format="snowball" /> <filter class="solr.SnowballPorterFilterFactory" language="Swedish"/> <!-- less aggressive: <filter class="solr.SwedishLightStemFilterFactory"/> --> </analyzer> </fieldType> <!-- Thai --> <fieldType name="text_th" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.ThaiWordFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_th.txt" /> </analyzer> </fieldType> <!-- Turkish --> <fieldType name="text_tr" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.TurkishLowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="false" words="lang/stopwords_tr.txt" /> <filter class="solr.SnowballPorterFilterFactory" language="Turkish"/> </analyzer> </fieldType> </types> 
<!-- Similarity is the scoring routine for each document vs. a query. A custom Similarity or SimilarityFactory may be specified here, but the default is fine for most applications. For more info: http://wiki.apache.org/solr/SchemaXml#Similarity --> <!-- <similarity class="com.example.solr.CustomSimilarityFactory"> <str name="paramkey">param value</str> </similarity> --> </schema> |
$CATALINA_HOME\webapps\solr\solr\collection1\conf\solrconfig.xml
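One practical note before the file itself: its <updateHandler> section recommends "commitWithin" when adding documents instead of frequent explicit hard commits. With SolrJ that is just an extra argument to add; a sketch reusing the server and doc from the schema example above:

// Ask Solr to make the document searchable within 10 seconds,
// letting it batch the hard commit instead of committing per add.
server.add(doc, 10000) // commitWithin, in milliseconds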
<?xml version="1.0" encoding="UTF-8" ?> <!-- Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --> <!-- For more details about configuration options that may appear in this file, see http://wiki.apache.org/solr/SolrConfigXml. --> <config> <!-- In all configuration below, a prefix of "solr." for class names is an alias that causes solr to search appropriate packages, including org.apache.solr.(search|update|request|core|analysis) You may also specify a fully qualified Java classname if you have your own custom plugins. --> <!-- Controls what version of Lucene various components of Solr adhere to. Generally, you want to use the latest version to get all bug fixes and improvements. It is highly recommended that you fully re-index after changing this setting as it can affect both how text is indexed and queried. --> <luceneMatchVersion>4.7</luceneMatchVersion> <!-- <lib/> directives can be used to instruct Solr to load any Jars identified and use them to resolve any "plugins" specified in your solrconfig.xml or schema.xml (ie: Analyzers, Request Handlers, etc...). All directories and paths are resolved relative to the instanceDir. Please note that <lib/> directives are processed in the order that they appear in your solrconfig.xml file, and are "stacked" on top of each other when building a ClassLoader - so if you have plugin jars with dependencies on other jars, the "lower level" dependency jars should be loaded first. If a "./lib" directory exists in your instanceDir, all files found in it are included as if you had used the following syntax...
<lib dir="./lib" /> --> <!-- A 'dir' option by itself adds any files found in the directory to the classpath, this is useful for including all jars in a directory. When a 'regex' is specified in addition to a 'dir', only the files in that directory which completely match the regex (anchored on both ends) will be included. If a 'dir' option (with or without a regex) is used and nothing is found that matches, a warning will be logged. The examples below can be used to load some solr-contribs along with their external dependencies. --> <lib dir="../../../contrib/extraction/lib" regex=".*\.jar" /> <lib dir="../../../dist/" regex="solr-cell-\d.*\.jar" /> <lib dir="../../../contrib/clustering/lib/" regex=".*\.jar" /> <lib dir="../../../dist/" regex="solr-clustering-\d.*\.jar" /> <lib dir="../../../contrib/langid/lib/" regex=".*\.jar" /> <lib dir="../../../dist/" regex="solr-langid-\d.*\.jar" /> <lib dir="../../../contrib/velocity/lib" regex=".*\.jar" /> <lib dir="../../../dist/" regex="solr-velocity-\d.*\.jar" /> <!-- an exact 'path' can be used instead of a 'dir' to specify a specific jar file. This will cause a serious error to be logged if it can't be loaded. --> <!-- <lib path="../a-jar-that-does-not-exist.jar" /> --> <!-- Data Directory Used to specify an alternate directory to hold all index data other than the default ./data under the Solr home. If replication is in use, this should match the replication configuration. --> <dataDir>${solr.data.dir:}</dataDir> <!-- The DirectoryFactory to use for indexes. solr.StandardDirectoryFactory is filesystem based and tries to pick the best implementation for the current JVM and platform. solr.NRTCachingDirectoryFactory, the default, wraps solr.StandardDirectoryFactory and caches small files in memory for better NRT performance. One can force a particular implementation via solr.MMapDirectoryFactory, solr.NIOFSDirectoryFactory, or solr.SimpleFSDirectoryFactory. solr.RAMDirectoryFactory is memory based, not persistent, and doesn't work with replication. --> <directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"> <!-- These will be used if you are using the solr.HdfsDirectoryFactory, otherwise they will be ignored. If you don't plan on using hdfs, you can safely remove this section. --> <!-- The root directory that collection data should be written to. --> <str name="solr.hdfs.home">${solr.hdfs.home:}</str> <!-- The hadoop configuration files to use for the hdfs client. --> <str name="solr.hdfs.confdir">${solr.hdfs.confdir:}</str> <!-- Enable/Disable the hdfs cache. --> <str name="solr.hdfs.blockcache.enabled">${solr.hdfs.blockcache.enabled:true}</str> </directoryFactory> <!-- The CodecFactory for defining the format of the inverted index. The default implementation is SchemaCodecFactory, which is the official Lucene index format, but hooks into the schema to provide per-field customization of the postings lists and per-document values in the fieldType element (postingsFormat/docValuesFormat). Note that most of the alternative implementations are experimental, so if you choose to customize the index format, its a good idea to convert back to the official format e.g. via IndexWriter.addIndexes(IndexReader) before upgrading to a newer version to avoid unnecessary reindexing. 
--> <codecFactory class="solr.SchemaCodecFactory"/> <!-- To enable dynamic schema REST APIs, use the following for <schemaFactory>: <schemaFactory class="ManagedIndexSchemaFactory"> <bool name="mutable">true</bool> <str name="managedSchemaResourceName">managed-schema</str> </schemaFactory> When ManagedIndexSchemaFactory is specified, Solr will load the schema from he resource named in 'managedSchemaResourceName', rather than from schema.xml. Note that the managed schema resource CANNOT be named schema.xml. If the managed schema does not exist, Solr will create it after reading schema.xml, then rename 'schema.xml' to 'schema.xml.bak'. Do NOT hand edit the managed schema - external modifications will be ignored and overwritten as a result of schema modification REST API calls. When ManagedIndexSchemaFactory is specified with mutable = true, schema modification REST API calls will be allowed; otherwise, error responses will be sent back for these requests. --> <schemaFactory class="ClassicIndexSchemaFactory"/> <!-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Index Config - These settings control low-level behavior of indexing Most example settings here show the default value, but are commented out, to more easily see where customizations have been made. Note: This replaces <indexDefaults> and <mainIndex> from older versions ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ --> <indexConfig> <!-- maxFieldLength was removed in 4.0. To get similar behavior, include a LimitTokenCountFilterFactory in your fieldType definition. E.g. <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="10000"/> --> <!-- Maximum time to wait for a write lock (ms) for an IndexWriter. Default: 1000 --> <!-- <writeLockTimeout>1000</writeLockTimeout> --> <!-- The maximum number of simultaneous threads that may be indexing documents at once in IndexWriter; if more than this many threads arrive they will wait for others to finish. Default in Solr/Lucene is 8. --> <!-- <maxIndexingThreads>8</maxIndexingThreads> --> <!-- Expert: Enabling compound file will use less files for the index, using fewer file descriptors on the expense of performance decrease. Default in Lucene is "true". Default in Solr is "false" (since 3.6) --> <!-- <useCompoundFile>false</useCompoundFile> --> <!-- ramBufferSizeMB sets the amount of RAM that may be used by Lucene indexing for buffering added documents and deletions before they are flushed to the Directory. maxBufferedDocs sets a limit on the number of documents buffered before flushing. If both ramBufferSizeMB and maxBufferedDocs is set, then Lucene will flush based on whichever limit is hit first. The default is 100 MB. --> <!-- <ramBufferSizeMB>100</ramBufferSizeMB> --> <!-- <maxBufferedDocs>1000</maxBufferedDocs> --> <!-- Expert: Merge Policy The Merge Policy in Lucene controls how merging of segments is done. The default since Solr/Lucene 3.3 is TieredMergePolicy. The default since Lucene 2.3 was the LogByteSizeMergePolicy, Even older versions of Lucene used LogDocMergePolicy. --> <!-- <mergePolicy class="org.apache.lucene.index.TieredMergePolicy"> <int name="maxMergeAtOnce">10</int> <int name="segmentsPerTier">10</int> </mergePolicy> --> <!-- Merge Factor The merge factor controls how many segments will get merged at a time. For TieredMergePolicy, mergeFactor is a convenience parameter which will set both MaxMergeAtOnce and SegmentsPerTier at once. 
For LogByteSizeMergePolicy, mergeFactor decides how many new segments will be allowed before they are merged into one. Default is 10 for both merge policies. --> <!-- <mergeFactor>10</mergeFactor> --> <!-- Expert: Merge Scheduler The Merge Scheduler in Lucene controls how merges are performed. The ConcurrentMergeScheduler (Lucene 2.3 default) can perform merges in the background using separate threads. The SerialMergeScheduler (Lucene 2.2 default) does not. --> <!-- <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/> --> <!-- LockFactory This option specifies which Lucene LockFactory implementation to use. single = SingleInstanceLockFactory - suggested for a read-only index or when there is no possibility of another process trying to modify the index. native = NativeFSLockFactory - uses OS native file locking. Do not use when multiple solr webapps in the same JVM are attempting to share a single index. simple = SimpleFSLockFactory - uses a plain file for locking Defaults: 'native' is default for Solr3.6 and later, otherwise 'simple' is the default More details on the nuances of each LockFactory... http://wiki.apache.org/lucene-java/AvailableLockFactories --> <lockType>${solr.lock.type:native}</lockType> <!-- Unlock On Startup If true, unlock any held write or commit locks on startup. This defeats the locking mechanism that allows multiple processes to safely access a lucene index, and should be used with care. Default is "false". This is not needed if lock type is 'single' --> <!-- <unlockOnStartup>false</unlockOnStartup> --> <!-- Expert: Controls how often Lucene loads terms into memory Default is 128 and is likely good for most everyone. --> <!-- <termIndexInterval>128</termIndexInterval> --> <!-- If true, IndexReaders will be opened/reopened from the IndexWriter instead of from the Directory. Hosts in a master/slave setup should have this set to false while those in a SolrCloud cluster need to be set to true. Default: true --> <!-- <nrtMode>true</nrtMode> --> <!-- Commit Deletion Policy Custom deletion policies can be specified here. The class must implement org.apache.lucene.index.IndexDeletionPolicy. The default Solr IndexDeletionPolicy implementation supports deleting index commit points on number of commits, age of commit point and optimized status. The latest commit point should always be preserved regardless of the criteria. --> <!-- <deletionPolicy class="solr.SolrDeletionPolicy"> --> <!-- The number of commit points to be kept --> <!-- <str name="maxCommitsToKeep">1</str> --> <!-- The number of optimized commit points to be kept --> <!-- <str name="maxOptimizedCommitsToKeep">0</str> --> <!-- Delete all commit points once they have reached the given age. Supports DateMathParser syntax e.g. --> <!-- <str name="maxCommitAge">30MINUTES</str> <str name="maxCommitAge">1DAY</str> --> <!-- </deletionPolicy> --> <!-- Lucene Infostream To aid in advanced debugging, Lucene provides an "InfoStream" of detailed information when indexing. Setting the value to true will instruct the underlying Lucene IndexWriter to write its info stream to solr's log. By default, this is enabled here, and controlled through log4j.properties. --> <infoStream>true</infoStream> </indexConfig> <!-- JMX This example enables JMX if and only if an existing MBeanServer is found, use this if you want to configure JMX through JVM parameters. Remove this to disable exposing Solr configuration and statistics to JMX. 
For more details see http://wiki.apache.org/solr/SolrJmx --> <jmx /> <!-- If you want to connect to a particular server, specify the agentId --> <!-- <jmx agentId="myAgent" /> --> <!-- If you want to start a new MBeanServer, specify the serviceUrl --> <!-- <jmx serviceUrl="service:jmx:rmi:///jndi/rmi://localhost:9999/solr"/> --> <!-- The default high-performance update handler --> <updateHandler class="solr.DirectUpdateHandler2"> <!-- Enables a transaction log, used for real-time get, durability, and solr cloud replica recovery. The log can grow as big as uncommitted changes to the index, so use of a hard autoCommit is recommended (see below). "dir" - the target directory for transaction logs, defaults to the solr data directory. --> <updateLog> <str name="dir">${solr.ulog.dir:}</str> </updateLog> <!-- AutoCommit Perform a hard commit automatically under certain conditions. Instead of enabling autoCommit, consider using "commitWithin" when adding documents. http://wiki.apache.org/solr/UpdateXmlMessages maxDocs - Maximum number of documents to add since the last commit before automatically triggering a new commit. maxTime - Maximum amount of time in ms that is allowed to pass since a document was added before automatically triggering a new commit. openSearcher - if false, the commit causes recent index changes to be flushed to stable storage, but does not cause a new searcher to be opened to make those changes visible. If the updateLog is enabled, then it's highly recommended to have some sort of hard autoCommit to limit the log size. --> <autoCommit> <maxTime>${solr.autoCommit.maxTime:15000}</maxTime> <openSearcher>false</openSearcher> </autoCommit> <!-- softAutoCommit is like autoCommit except it causes a 'soft' commit which only ensures that changes are visible but does not ensure that data is synced to disk. This is faster and more near-realtime friendly than a hard commit. --> <autoSoftCommit> <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime> </autoSoftCommit> <!-- Update Related Event Listeners Various IndexWriter related events can trigger Listeners to take actions. postCommit - fired after every commit or optimize command postOptimize - fired after every optimize command --> <!-- The RunExecutableListener executes an external command from a hook such as postCommit or postOptimize. exe - the name of the executable to run dir - dir to use as the current working directory. (default=".") wait - the calling thread waits until the executable returns. (default="true") args - the arguments to pass to the program. (default is none) env - environment variables to set. (default is none) --> <!-- This example shows how RunExecutableListener could be used with the script based replication... http://wiki.apache.org/solr/CollectionDistribution --> <!-- <listener event="postCommit" class="solr.RunExecutableListener"> <str name="exe">solr/bin/snapshooter</str> <str name="dir">.</str> <bool name="wait">true</bool> <arr name="args"> <str>arg1</str> <str>arg2</str> </arr> <arr name="env"> <str>MYVAR=val1</str> </arr> </listener> --> </updateHandler> <!-- IndexReaderFactory Use the following format to specify a custom IndexReaderFactory, which allows for alternate IndexReader implementations. ** Experimental Feature ** Please note - Using a custom IndexReaderFactory may prevent certain other features from working. The API to IndexReaderFactory may change without warning or may even be removed from future releases if the problems cannot be resolved.
** Features that may not work with custom IndexReaderFactory ** The ReplicationHandler assumes a disk-resident index. Using a custom IndexReader implementation may cause incompatibility with ReplicationHandler and may cause replication to not work correctly. See SOLR-1366 for details. --> <!-- <indexReaderFactory name="IndexReaderFactory" class="package.class"> <str name="someArg">Some Value</str> </indexReaderFactory > --> <!-- By explicitly declaring the Factory, the termIndexDivisor can be specified. --> <!-- <indexReaderFactory name="IndexReaderFactory" class="solr.StandardIndexReaderFactory"> <int name="setTermIndexDivisor">12</int> </indexReaderFactory > --> <!-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Query section - these settings control query time things like caches ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ --> <query> <!-- Max Boolean Clauses Maximum number of clauses in each BooleanQuery, an exception is thrown if exceeded. ** WARNING ** This option actually modifies a global Lucene property that will affect all SolrCores. If multiple solrconfig.xml files disagree on this property, the value at any given moment will be based on the last SolrCore to be initialized. --> <maxBooleanClauses>1024</maxBooleanClauses> <!-- Solr Internal Query Caches There are two implementations of cache available for Solr, LRUCache, based on a synchronized LinkedHashMap, and FastLRUCache, based on a ConcurrentHashMap. FastLRUCache has faster gets and slower puts in single threaded operation and thus is generally faster than LRUCache when the hit ratio of the cache is high (> 75%), and may be faster under other scenarios on multi-cpu systems. --> <!-- Filter Cache Cache used by SolrIndexSearcher for filters (DocSets), unordered sets of *all* documents that match a query. When a new searcher is opened, its caches may be prepopulated or "autowarmed" using data from caches in the old searcher. autowarmCount is the number of items to prepopulate. For LRUCache, the autowarmed items will be the most recently accessed items. Parameters: class - the SolrCache implementation (LRUCache or FastLRUCache) size - the maximum number of entries in the cache initialSize - the initial capacity (number of entries) of the cache. (see java.util.HashMap) autowarmCount - the number of entries to prepopulate from an old cache. --> <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/> <!-- Query Result Cache Caches results of searches - ordered lists of document ids (DocList) based on a query, a sort, and the range of documents requested. --> <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/> <!-- Document Cache Caches Lucene Document objects (the stored fields for each document). Since Lucene internal document ids are transient, this cache will not be autowarmed. --> <documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/> <!-- custom cache currently used by block join --> <cache name="perSegFilter" class="solr.search.LRUCache" size="10" initialSize="0" autowarmCount="10" regenerator="solr.NoOpRegenerator" /> <!-- Field Value Cache Cache used to hold field values that are quickly accessible by document id. The fieldValueCache is created by default even if not configured here. --> <!-- <fieldValueCache class="solr.FastLRUCache" size="512" autowarmCount="128" showItems="32" /> --> <!-- Custom Cache Example of a generic cache.
These caches may be accessed by name through SolrIndexSearcher.getCache(),cacheLookup(), and cacheInsert(). The purpose is to enable easy caching of user/application level data. The regenerator argument should be specified as an implementation of solr.CacheRegenerator if autowarming is desired. --> <!-- <cache name="myUserCache" class="solr.LRUCache" size="4096" initialSize="1024" autowarmCount="1024" regenerator="com.mycompany.MyRegenerator" /> --> <!-- Lazy Field Loading If true, stored fields that are not requested will be loaded lazily. This can result in a significant speed improvement if the usual case is to not load all stored fields, especially if the skipped fields are large compressed text fields. --> <enableLazyFieldLoading>true</enableLazyFieldLoading> <!-- Use Filter For Sorted Query A possible optimization that attempts to use a filter to satisfy a search. If the requested sort does not include score, then the filterCache will be checked for a filter matching the query. If found, the filter will be used as the source of document ids, and then the sort will be applied to that. For most situations, this will not be useful unless you frequently get the same search repeatedly with different sort options, and none of them ever use "score" --> <!-- <useFilterForSortedQuery>true</useFilterForSortedQuery> --> <!-- Result Window Size An optimization for use with the queryResultCache. When a search is requested, a superset of the requested number of document ids are collected. For example, if a search for a particular query requests matching documents 10 through 19, and queryWindowSize is 50, then documents 0 through 49 will be collected and cached. Any further requests in that range can be satisfied via the cache. --> <queryResultWindowSize>20</queryResultWindowSize> <!-- Maximum number of documents to cache for any entry in the queryResultCache. --> <queryResultMaxDocsCached>200</queryResultMaxDocsCached> <!-- Query Related Event Listeners Various IndexSearcher related events can trigger Listeners to take actions. newSearcher - fired whenever a new searcher is being prepared and there is a current searcher handling requests (aka registered). It can be used to prime certain caches to prevent long request times for certain requests. firstSearcher - fired whenever a new searcher is being prepared but there is no current registered searcher to handle requests or to gain autowarming data from. --> <!-- QuerySenderListener takes an array of NamedList and executes a local query request for each NamedList in sequence. --> <listener event="newSearcher" class="solr.QuerySenderListener"> <arr name="queries"> <!-- <lst><str name="q">solr</str><str name="sort">price asc</str></lst> <lst><str name="q">rocks</str><str name="sort">weight asc</str></lst> --> </arr> </listener> <listener event="firstSearcher" class="solr.QuerySenderListener"> <arr name="queries"> <lst> <str name="q">static firstSearcher warming in solrconfig.xml</str> </lst> </arr> </listener> <!-- Use Cold Searcher If a search request comes in and there is no current registered searcher, then immediately register the still warming searcher and use it. If "false" then all requests will block until the first searcher is done warming. --> <useColdSearcher>false</useColdSearcher> <!-- Max Warming Searchers Maximum number of searchers that may be warming in the background concurrently. An error is returned if this limit is exceeded. Recommend values of 1-2 for read-only slaves, higher for masters w/o cache warming. 
--> <maxWarmingSearchers>2</maxWarmingSearchers> </query> <!-- Request Dispatcher This section contains instructions for how the SolrDispatchFilter should behave when processing requests for this SolrCore. handleSelect is a legacy option that affects the behavior of requests such as /select?qt=XXX handleSelect="true" will cause the SolrDispatchFilter to process the request and dispatch the query to a handler specified by the "qt" param, assuming "/select" isn't already registered. handleSelect="false" will cause the SolrDispatchFilter to ignore "/select" requests, resulting in a 404 unless a handler is explicitly registered with the name "/select" handleSelect="true" is not recommended for new users, but is the default for backwards compatibility --> <requestDispatcher handleSelect="false" > <!-- Request Parsing These settings indicate how Solr Requests may be parsed, and what restrictions may be placed on the ContentStreams from those requests enableRemoteStreaming - enables use of the stream.file and stream.url parameters for specifying remote streams. multipartUploadLimitInKB - specifies the max size (in KiB) of Multipart File Uploads that Solr will allow in a Request. formdataUploadLimitInKB - specifies the max size (in KiB) of form data (application/x-www-form-urlencoded) sent via POST. You can use POST to pass request parameters not fitting into the URL. addHttpRequestToContext - if set to true, it will instruct the requestParsers to include the original HttpServletRequest object in the context map of the SolrQueryRequest under the key "httpRequest". It will not be used by any of the existing Solr components, but may be useful when developing custom plugins. *** WARNING *** The settings below authorize Solr to fetch remote files, You should make sure your system has some authentication before using enableRemoteStreaming="true" --> <requestParsers enableRemoteStreaming="true" multipartUploadLimitInKB="2048000" formdataUploadLimitInKB="2048" addHttpRequestToContext="false"/> <!-- HTTP Caching Set HTTP caching related parameters (for proxy caches and clients). The options below instruct Solr not to output any HTTP Caching related headers --> <httpCaching never304="true" /> <!-- If you include a <cacheControl> directive, it will be used to generate a Cache-Control header (as well as an Expires header if the value contains "max-age=") By default, no Cache-Control header is generated. You can use the <cacheControl> option even if you have set never304="true" --> <!-- <httpCaching never304="true" > <cacheControl>max-age=30, public</cacheControl> </httpCaching> --> <!-- To enable Solr to respond with automatically generated HTTP Caching headers, and to response to Cache Validation requests correctly, set the value of never304="false" This will cause Solr to generate Last-Modified and ETag headers based on the properties of the Index. The following options can also be specified to affect the values of these headers... lastModFrom - the default value is "openTime" which means the Last-Modified value (and validation against If-Modified-Since requests) will all be relative to when the current Searcher was opened. You can change it to lastModFrom="dirLastMod" if you want the value to exactly correspond to when the physical index was last modified. etagSeed="..." 
is an option you can change to force the ETag header (and validation against If-None-Match requests) to be different even if the index has not changed (ie: when making significant changes to your config file) (lastModifiedFrom and etagSeed are both ignored if you use the never304="true" option) --> <!-- <httpCaching lastModifiedFrom="openTime" etagSeed="Solr"> <cacheControl>max-age=30, public</cacheControl> </httpCaching> --> </requestDispatcher> <!-- Request Handlers http://wiki.apache.org/solr/SolrRequestHandler Incoming queries will be dispatched to a specific handler by name based on the path specified in the request. Legacy behavior: If the request path uses "/select" but no Request Handler has that name, and if handleSelect="true" has been specified in the requestDispatcher, then the Request Handler is dispatched based on the qt parameter. Handlers without a leading '/' are accessed this way like so: http://host/app/[core/]select?qt=name If no qt is given, then the requestHandler that declares default="true" will be used or the one named "standard". If a Request Handler is declared with startup="lazy", then it will not be initialized until the first request that uses it. --> <!-- SearchHandler http://wiki.apache.org/solr/SearchHandler For processing Search Queries, the primary Request Handler provided with Solr is "SearchHandler" It delegates to a sequent of SearchComponents (see below) and supports distributed queries across multiple shards --> <requestHandler name="/select" class="solr.SearchHandler"> <!-- default values for query parameters can be specified, these will be overridden by parameters in the request --> <lst name="defaults"> <str name="echoParams">explicit</str> <int name="rows">10</int> <str name="df">text</str> </lst> <!-- In addition to defaults, "appends" params can be specified to identify values which should be appended to the list of multi-val params from the query (or the existing "defaults"). --> <!-- In this example, the param "fq=instock:true" would be appended to any query time fq params the user may specify, as a mechanism for partitioning the index, independent of any user selected filtering that may also be desired (perhaps as a result of faceted searching). NOTE: there is *absolutely* nothing a client can do to prevent these "appends" values from being used, so don't use this mechanism unless you are sure you always want it. --> <!-- <lst name="appends"> <str name="fq">inStock:true</str> </lst> --> <!-- "invariants" are a way of letting the Solr maintainer lock down the options available to Solr clients. Any params values specified here are used regardless of what values may be specified in either the query, the "defaults", or the "appends" params. In this example, the facet.field and facet.query params would be fixed, limiting the facets clients can use. Faceting is not turned on by default - but if the client does specify facet=true in the request, these are the only facets they will be able to see counts for; regardless of what other facet.field or facet.query params they may specify. NOTE: there is *absolutely* nothing a client can do to prevent these "invariants" values from being used, so don't use this mechanism unless you are sure you always want it. 
--> <!-- <lst name="invariants"> <str name="facet.field">cat</str> <str name="facet.field">manu_exact</str> <str name="facet.query">price:[* TO 500]</str> <str name="facet.query">price:[500 TO *]</str> </lst> --> <!-- If the default list of SearchComponents is not desired, that list can either be overridden completely, or components can be prepended or appended to the default list. (see below) --> <!-- <arr name="components"> <str>nameOfCustomComponent1</str> <str>nameOfCustomComponent2</str> </arr> --> </requestHandler> <!-- A request handler that returns indented JSON by default --> <requestHandler name="/query" class="solr.SearchHandler"> <lst name="defaults"> <str name="echoParams">explicit</str> <str name="wt">json</str> <str name="indent">true</str> <str name="df">text</str> </lst> </requestHandler> <!-- realtime get handler, guaranteed to return the latest stored fields of any document, without the need to commit or open a new searcher. The current implementation relies on the updateLog feature being enabled. ** WARNING ** Do NOT disable the realtime get handler at /get if you are using SolrCloud otherwise any leader election will cause a full sync in ALL replicas for the shard in question. Similarly, a replica recovery will also always fetch the complete index from the leader because a partial sync will not be possible in the absence of this handler. --> <requestHandler name="/get" class="solr.RealTimeGetHandler"> <lst name="defaults"> <str name="omitHeader">true</str> <str name="wt">json</str> <str name="indent">true</str> </lst> </requestHandler> <!-- A Robust Example This example SearchHandler declaration shows off usage of the SearchHandler with many defaults declared Note that multiple instances of the same Request Handler (SearchHandler) can be registered multiple times with different names (and different init parameters) --> <requestHandler name="/browse" class="solr.SearchHandler"> <lst name="defaults"> <str name="echoParams">explicit</str> <!-- VelocityResponseWriter settings --> <str name="wt">velocity</str> <str name="v.template">browse</str> <str name="v.layout">layout</str> <str name="title">Solritas</str> <!-- Query settings --> <str name="defType">edismax</str> <str name="qf"> text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4 title^10.0 description^5.0 keywords^5.0 author^2.0 resourcename^1.0 </str> <str name="df">text</str> <str name="mm">100%</str> <str name="q.alt">*:*</str> <str name="rows">10</str> <str name="fl">*,score</str> <str name="mlt.qf"> text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4 title^10.0 description^5.0 keywords^5.0 author^2.0 resourcename^1.0 </str> <str name="mlt.fl">text,features,name,sku,id,manu,cat,title,description,keywords,author,resourcename</str> <int name="mlt.count">3</int> <!-- Faceting defaults --> <str name="facet">on</str> <str name="facet.field">cat</str> <str name="facet.field">manu_exact</str> <str name="facet.field">content_type</str> <str name="facet.field">author_s</str> <str name="facet.query">ipod</str> <str name="facet.query">GB</str> <str name="facet.mincount">1</str> <str name="facet.pivot">cat,inStock</str> <str name="facet.range.other">after</str> <str name="facet.range">price</str> <int name="f.price.facet.range.start">0</int> <int name="f.price.facet.range.end">600</int> <int name="f.price.facet.range.gap">50</int> <str name="facet.range">popularity</str> <int name="f.popularity.facet.range.start">0</int> <int name="f.popularity.facet.range.end">10</int> <int 
name="f.popularity.facet.range.gap">3</int> <str name="facet.range">manufacturedate_dt</str> <str name="f.manufacturedate_dt.facet.range.start">NOW/YEAR-10YEARS</str> <str name="f.manufacturedate_dt.facet.range.end">NOW</str> <str name="f.manufacturedate_dt.facet.range.gap">+1YEAR</str> <str name="f.manufacturedate_dt.facet.range.other">before</str> <str name="f.manufacturedate_dt.facet.range.other">after</str> <!-- Highlighting defaults --> <str name="hl">on</str> <str name="hl.fl">content features title name</str> <str name="hl.encoder">html</str> <str name="hl.simple.pre"><b></str> <str name="hl.simple.post"></b></str> <str name="f.title.hl.fragsize">0</str> <str name="f.title.hl.alternateField">title</str> <str name="f.name.hl.fragsize">0</str> <str name="f.name.hl.alternateField">name</str> <str name="f.content.hl.snippets">3</str> <str name="f.content.hl.fragsize">200</str> <str name="f.content.hl.alternateField">content</str> <str name="f.content.hl.maxAlternateFieldLength">750</str> <!-- Spell checking defaults --> <str name="spellcheck">on</str> <str name="spellcheck.extendedResults">false</str> <str name="spellcheck.count">5</str> <str name="spellcheck.alternativeTermCount">2</str> <str name="spellcheck.maxResultsForSuggest">5</str> <str name="spellcheck.collate">true</str> <str name="spellcheck.collateExtendedResults">true</str> <str name="spellcheck.maxCollationTries">5</str> <str name="spellcheck.maxCollations">3</str> </lst> <!-- append spellchecking to our list of components --> <arr name="last-components"> <str>spellcheck</str> </arr> </requestHandler> <!-- Update Request Handler. http://wiki.apache.org/solr/UpdateXmlMessages The canonical Request Handler for Modifying the Index through commands specified using XML, JSON, CSV, or JAVABIN Note: Since solr1.1 requestHandlers requires a valid content type header if posted in the body. For example, curl now requires: -H 'Content-type:text/xml; charset=utf-8' To override the request content type and force a specific Content-type, use the request parameter: ?update.contentType=text/csv This handler will pick a response format to match the input if the 'wt' parameter is not explicit --> <requestHandler name="/update" class="solr.UpdateRequestHandler"> <!-- See below for information on defining updateRequestProcessorChains that can be used by name on each Update Request --> <!-- <lst name="defaults"> <str name="update.chain">dedupe</str> </lst> --> </requestHandler> <!-- for back compat with clients using /update/json and /update/csv --> <requestHandler name="/update/json" class="solr.UpdateRequestHandler"> <lst name="defaults"> <str name="stream.contentType">application/json</str> </lst> </requestHandler> <requestHandler name="/update/csv" class="solr.UpdateRequestHandler"> <lst name="defaults"> <str name="stream.contentType">application/csv</str> </lst> </requestHandler> <!-- Solr Cell Update Request Handler http://wiki.apache.org/solr/ExtractingRequestHandler --> <requestHandler name="/update/extract" startup="lazy" class="solr.extraction.ExtractingRequestHandler" > <lst name="defaults"> <str name="lowernames">true</str> <str name="uprefix">ignored_</str> <!-- capture link hrefs but ignore div attributes --> <str name="captureAttr">true</str> <str name="fmap.a">links</str> <str name="fmap.div">ignored_</str> </lst> </requestHandler> <!-- Field Analysis Request Handler RequestHandler that provides much the same functionality as analysis.jsp. 
Provides the ability to specify multiple field types and field names in the same request and outputs index-time and query-time analysis for each of them. Request parameters are: analysis.fieldname - field name whose analyzers are to be used analysis.fieldtype - field type whose analyzers are to be used analysis.fieldvalue - text for index-time analysis q (or analysis.q) - text for query time analysis analysis.showmatch (true|false) - When set to true and when query analysis is performed, the produced tokens of the field value analysis will be marked as "matched" for every token that is produces by the query analysis --> <requestHandler name="/analysis/field" startup="lazy" class="solr.FieldAnalysisRequestHandler" /> <!-- Document Analysis Handler http://wiki.apache.org/solr/AnalysisRequestHandler An analysis handler that provides a breakdown of the analysis process of provided documents. This handler expects a (single) content stream with the following format: <docs> <doc> <field name="id">1</field> <field name="name">The Name</field> <field name="text">The Text Value</field> </doc> <doc>...</doc> <doc>...</doc> ... </docs> Note: Each document must contain a field which serves as the unique key. This key is used in the returned response to associate an analysis breakdown to the analyzed document. Like the FieldAnalysisRequestHandler, this handler also supports query analysis by sending either an "analysis.query" or "q" request parameter that holds the query text to be analyzed. It also supports the "analysis.showmatch" parameter which when set to true, all field tokens that match the query tokens will be marked as a "match". --> <requestHandler name="/analysis/document" class="solr.DocumentAnalysisRequestHandler" startup="lazy" /> <!-- Admin Handlers Admin Handlers - This will register all the standard admin RequestHandlers. --> <requestHandler name="/admin/" class="solr.admin.AdminHandlers" /> <!-- This single handler is equivalent to the following... --> <!-- <requestHandler name="/admin/luke" class="solr.admin.LukeRequestHandler" /> <requestHandler name="/admin/system" class="solr.admin.SystemInfoHandler" /> <requestHandler name="/admin/plugins" class="solr.admin.PluginInfoHandler" /> <requestHandler name="/admin/threads" class="solr.admin.ThreadDumpHandler" /> <requestHandler name="/admin/properties" class="solr.admin.PropertiesRequestHandler" /> <requestHandler name="/admin/file" class="solr.admin.ShowFileRequestHandler" > --> <!-- If you wish to hide files under ${solr.home}/conf, explicitly register the ShowFileRequestHandler using the definition below. NOTE: The glob pattern ('*') is the only pattern supported at present, *.xml will not exclude all files ending in '.xml'. Use it to exclude _all_ updates --> <!-- <requestHandler name="/admin/file" class="solr.admin.ShowFileRequestHandler" > <lst name="invariants"> <str name="hidden">synonyms.txt</str> <str name="hidden">anotherfile.txt</str> <str name="hidden">*</str> </lst> </requestHandler> --> <!-- Enabling this request handler (which is NOT a default part of the admin handler) will allow the Solr UI to edit all the config files. This is intended for secure/development use ONLY! Leaving available and publically accessible is a security vulnerability and should be done with extreme caution! 
--> <!-- ping/healthcheck --> <requestHandler name="/admin/ping" class="solr.PingRequestHandler"> <lst name="invariants"> <str name="q">solrpingquery</str> </lst> <lst name="defaults"> <str name="echoParams">all</str> </lst> <!-- An optional feature of the PingRequestHandler is to configure the handler with a "healthcheckFile" which can be used to enable/disable the PingRequestHandler. relative paths are resolved against the data dir --> <!-- <str name="healthcheckFile">server-enabled.txt</str> --> </requestHandler> <!-- Echo the request contents back to the client --> <requestHandler name="/debug/dump" class="solr.DumpRequestHandler" > <lst name="defaults"> <str name="echoParams">explicit</str> <str name="echoHandler">true</str> </lst> </requestHandler> <!-- Solr Replication The SolrReplicationHandler supports replicating indexes from a "master" used for indexing and "slaves" used for queries. http://wiki.apache.org/solr/SolrReplication It is also necessary for SolrCloud to function (in Cloud mode, the replication handler is used to bulk transfer segments when nodes are added or need to recover). https://wiki.apache.org/solr/SolrCloud/ --> <requestHandler name="/replication" class="solr.ReplicationHandler" > <!-- To enable simple master/slave replication, uncomment one of the sections below, depending on whether this solr instance should be the "master" or a "slave". If this instance is a "slave" you will also need to fill in the masterUrl to point to a real machine. --> <!-- <lst name="master"> <str name="replicateAfter">commit</str> <str name="replicateAfter">startup</str> <str name="confFiles">schema.xml,stopwords.txt</str> </lst> --> <!-- <lst name="slave"> <str name="masterUrl">http://your-master-hostname:8983/solr</str> <str name="pollInterval">00:00:60</str> </lst> --> </requestHandler> <!-- Search Components Search components are registered to SolrCore and used by instances of SearchHandler (which can access them by name) By default, the following components are available: <searchComponent name="query" class="solr.QueryComponent" /> <searchComponent name="facet" class="solr.FacetComponent" /> <searchComponent name="mlt" class="solr.MoreLikeThisComponent" /> <searchComponent name="highlight" class="solr.HighlightComponent" /> <searchComponent name="stats" class="solr.StatsComponent" /> <searchComponent name="debug" class="solr.DebugComponent" /> Default configuration in a requestHandler would look like: <arr name="components"> <str>query</str> <str>facet</str> <str>mlt</str> <str>highlight</str> <str>stats</str> <str>debug</str> </arr> If you register a searchComponent to one of the standard names, that will be used instead of the default. To insert components before or after the 'standard' components, use: <arr name="first-components"> <str>myFirstComponentName</str> </arr> <arr name="last-components"> <str>myLastComponentName</str> </arr> NOTE: The component registered with the name "debug" will always be executed after the "last-components" --> <!-- Spell Check The spell check component can return a list of alternative spelling suggestions. 
http://wiki.apache.org/solr/SpellCheckComponent --> <searchComponent name="spellcheck" class="solr.SpellCheckComponent"> <str name="queryAnalyzerFieldType">text_general</str> <!-- Multiple "Spell Checkers" can be declared and used by this component --> <!-- a spellchecker built from a field of the main index --> <lst name="spellchecker"> <str name="name">default</str> <str name="field">text</str> <str name="classname">solr.DirectSolrSpellChecker</str> <!-- the spellcheck distance measure used, the default is the internal levenshtein --> <str name="distanceMeasure">internal</str> <!-- minimum accuracy needed to be considered a valid spellcheck suggestion --> <float name="accuracy">0.5</float> <!-- the maximum #edits we consider when enumerating terms: can be 1 or 2 --> <int name="maxEdits">2</int> <!-- the minimum shared prefix when enumerating terms --> <int name="minPrefix">1</int> <!-- maximum number of inspections per result. --> <int name="maxInspections">5</int> <!-- minimum length of a query term to be considered for correction --> <int name="minQueryLength">4</int> <!-- maximum threshold of documents a query term can appear to be considered for correction --> <float name="maxQueryFrequency">0.01</float> <!-- uncomment this to require suggestions to occur in 1% of the documents <float name="thresholdTokenFrequency">.01</float> --> </lst> <!-- a spellchecker that can break or combine words. See "/spell" handler below for usage --> <lst name="spellchecker"> <str name="name">wordbreak</str> <str name="classname">solr.WordBreakSolrSpellChecker</str> <str name="field">name</str> <str name="combineWords">true</str> <str name="breakWords">true</str> <int name="maxChanges">10</int> </lst> <!-- a spellchecker that uses a different distance measure --> <!-- <lst name="spellchecker"> <str name="name">jarowinkler</str> <str name="field">spell</str> <str name="classname">solr.DirectSolrSpellChecker</str> <str name="distanceMeasure"> org.apache.lucene.search.spell.JaroWinklerDistance </str> </lst> --> <!-- a spellchecker that use an alternate comparator comparatorClass be one of: 1. score (default) 2. freq (Frequency first, then score) 3. A fully qualified class name --> <!-- <lst name="spellchecker"> <str name="name">freq</str> <str name="field">lowerfilt</str> <str name="classname">solr.DirectSolrSpellChecker</str> <str name="comparatorClass">freq</str> --> <!-- A spellchecker that reads the list of words from a file --> <!-- <lst name="spellchecker"> <str name="classname">solr.FileBasedSpellChecker</str> <str name="name">file</str> <str name="sourceLocation">spellings.txt</str> <str name="characterEncoding">UTF-8</str> <str name="spellcheckIndexDir">spellcheckerFile</str> </lst> --> </searchComponent> <!-- A request handler for demonstrating the spellcheck component. NOTE: This is purely as an example. The whole purpose of the SpellCheckComponent is to hook it into the request handler that handles your normal user queries so that a separate request is not needed to get suggestions. IN OTHER WORDS, THERE IS REALLY GOOD CHANCE THE SETUP BELOW IS NOT WHAT YOU WANT FOR YOUR PRODUCTION SYSTEM! See http://wiki.apache.org/solr/SpellCheckComponent for details on the request parameters. --> <requestHandler name="/spell" class="solr.SearchHandler" startup="lazy"> <lst name="defaults"> <str name="df">text</str> <!-- Solr will use suggestions from both the 'default' spellchecker and from the 'wordbreak' spellchecker and combine them. 
collations (re-written queries) can include a combination of corrections from both spellcheckers --> <str name="spellcheck.dictionary">default</str> <str name="spellcheck.dictionary">wordbreak</str> <str name="spellcheck">on</str> <str name="spellcheck.extendedResults">true</str> <str name="spellcheck.count">10</str> <str name="spellcheck.alternativeTermCount">5</str> <str name="spellcheck.maxResultsForSuggest">5</str> <str name="spellcheck.collate">true</str> <str name="spellcheck.collateExtendedResults">true</str> <str name="spellcheck.maxCollationTries">10</str> <str name="spellcheck.maxCollations">5</str> </lst> <arr name="last-components"> <str>spellcheck</str> </arr> </requestHandler> <searchComponent name="suggest" class="solr.SuggestComponent"> <lst name="suggester"> <str name="name">mySuggester</str> <str name="lookupImpl">FuzzyLookupFactory</str> <!-- org.apache.solr.spelling.suggest.fst --> <str name="dictionaryImpl">DocumentDictionaryFactory</str> <!-- org.apache.solr.spelling.suggest.HighFrequencyDictionaryFactory --> <str name="field">cat</str> <str name="weightField">price</str> <str name="suggestAnalyzerFieldType">string</str> </lst> </searchComponent> <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy"> <lst name="defaults"> <str name="suggest">true</str> <str name="suggest.count">10</str> </lst> <arr name="components"> <str>suggest</str> </arr> </requestHandler> <!-- Term Vector Component http://wiki.apache.org/solr/TermVectorComponent --> <searchComponent name="tvComponent" class="solr.TermVectorComponent"/> <!-- A request handler for demonstrating the term vector component This is purely as an example. In reality you will likely want to add the component to your already specified request handlers. --> <requestHandler name="/tvrh" class="solr.SearchHandler" startup="lazy"> <lst name="defaults"> <str name="df">text</str> <bool name="tv">true</bool> </lst> <arr name="last-components"> <str>tvComponent</str> </arr> </requestHandler> <!-- Clustering Component You'll need to set the solr.clustering.enabled system property when running solr to run with clustering enabled: java -Dsolr.clustering.enabled=true -jar start.jar http://wiki.apache.org/solr/ClusteringComponent http://carrot2.github.io/solr-integration-strategies/ --> <searchComponent name="clustering" enable="${solr.clustering.enabled:false}" class="solr.clustering.ClusteringComponent" > <lst name="engine"> <str name="name">lingo</str> <!-- Class name of a clustering algorithm compatible with the Carrot2 framework. Currently available open source algorithms are: * org.carrot2.clustering.lingo.LingoClusteringAlgorithm * org.carrot2.clustering.stc.STCClusteringAlgorithm * org.carrot2.clustering.kmeans.BisectingKMeansClusteringAlgorithm See http://project.carrot2.org/algorithms.html for more information. A commercial algorithm Lingo3G (needs to be installed separately) is defined as: * com.carrotsearch.lingo3g.Lingo3GClusteringAlgorithm --> <str name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str> <!-- Override location of the clustering algorithm's resources (attribute definitions and lexical resources). A directory from which to load algorithm-specific stop words, stop labels and attribute definition XMLs. 
For an overview of Carrot2 lexical resources, see: http://download.carrot2.org/head/manual/#chapter.lexical-resources For an overview of Lingo3G lexical resources, see: http://download.carrotsearch.com/lingo3g/manual/#chapter.lexical-resources --> <str name="carrot.resourcesDir">clustering/carrot2</str> </lst> <!-- An example definition for the STC clustering algorithm. --> <lst name="engine"> <str name="name">stc</str> <str name="carrot.algorithm">org.carrot2.clustering.stc.STCClusteringAlgorithm</str> </lst> <!-- An example definition for the bisecting kmeans clustering algorithm. --> <lst name="engine"> <str name="name">kmeans</str> <str name="carrot.algorithm">org.carrot2.clustering.kmeans.BisectingKMeansClusteringAlgorithm</str> </lst> </searchComponent> <!-- A request handler for demonstrating the clustering component This is purely as an example. In reality you will likely want to add the component to your already specified request handlers. --> <requestHandler name="/clustering" startup="lazy" enable="${solr.clustering.enabled:false}" class="solr.SearchHandler"> <lst name="defaults"> <bool name="clustering">true</bool> <bool name="clustering.results">true</bool> <!-- Field name with the logical "title" of a each document (optional) --> <str name="carrot.title">name</str> <!-- Field name with the logical "URL" of a each document (optional) --> <str name="carrot.url">id</str> <!-- Field name with the logical "content" of a each document (optional) --> <str name="carrot.snippet">features</str> <!-- Apply highlighter to the title/ content and use this for clustering. --> <bool name="carrot.produceSummary">true</bool> <!-- the maximum number of labels per cluster --> <!--<int name="carrot.numDescriptions">5</int>--> <!-- produce sub clusters --> <bool name="carrot.outputSubClusters">false</bool> <!-- Configure the remaining request handler parameters. --> <str name="defType">edismax</str> <str name="qf"> text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4 </str> <str name="q.alt">*:*</str> <str name="rows">10</str> <str name="fl">*,score</str> </lst> <arr name="last-components"> <str>clustering</str> </arr> </requestHandler> <!-- Terms Component http://wiki.apache.org/solr/TermsComponent A component to return terms and document frequency of those terms --> <searchComponent name="terms" class="solr.TermsComponent"/> <!-- A request handler for demonstrating the terms component --> <requestHandler name="/terms" class="solr.SearchHandler" startup="lazy"> <lst name="defaults"> <bool name="terms">true</bool> <bool name="distrib">false</bool> </lst> <arr name="components"> <str>terms</str> </arr> </requestHandler> <!-- Query Elevation Component http://wiki.apache.org/solr/QueryElevationComponent a search component that enables you to configure the top results for a given query regardless of the normal lucene scoring. 
--> <searchComponent name="elevator" class="solr.QueryElevationComponent" > <!-- pick a fieldType to analyze queries --> <str name="queryFieldType">string</str> <str name="config-file">elevate.xml</str> </searchComponent> <!-- A request handler for demonstrating the elevator component --> <requestHandler name="/elevate" class="solr.SearchHandler" startup="lazy"> <lst name="defaults"> <str name="echoParams">explicit</str> <str name="df">text</str> </lst> <arr name="last-components"> <str>elevator</str> </arr> </requestHandler> <!-- Highlighting Component http://wiki.apache.org/solr/HighlightingParameters --> <searchComponent class="solr.HighlightComponent" name="highlight"> <highlighting> <!-- Configure the standard fragmenter --> <!-- This could most likely be commented out in the "default" case --> <fragmenter name="gap" default="true" class="solr.highlight.GapFragmenter"> <lst name="defaults"> <int name="hl.fragsize">100</int> </lst> </fragmenter> <!-- A regular-expression-based fragmenter (for sentence extraction) --> <fragmenter name="regex" class="solr.highlight.RegexFragmenter"> <lst name="defaults"> <!-- slightly smaller fragsizes work better because of slop --> <int name="hl.fragsize">70</int> <!-- allow 50% slop on fragment sizes --> <float name="hl.regex.slop">0.5</float> <!-- a basic sentence pattern --> <str name="hl.regex.pattern">[-\w ,/\n\"']{20,200}</str> </lst> </fragmenter> <!-- Configure the standard formatter --> <formatter name="html" default="true" class="solr.highlight.HtmlFormatter"> <lst name="defaults"> <str name="hl.simple.pre"><![CDATA[<em>]]></str> <str name="hl.simple.post"><![CDATA[</em>]]></str> </lst> </formatter> <!-- Configure the standard encoder --> <encoder name="html" class="solr.highlight.HtmlEncoder" /> <!-- Configure the standard fragListBuilder --> <fragListBuilder name="simple" class="solr.highlight.SimpleFragListBuilder"/> <!-- Configure the single fragListBuilder --> <fragListBuilder name="single" class="solr.highlight.SingleFragListBuilder"/> <!-- Configure the weighted fragListBuilder --> <fragListBuilder name="weighted" default="true" class="solr.highlight.WeightedFragListBuilder"/> <!-- default tag FragmentsBuilder --> <fragmentsBuilder name="default" default="true" class="solr.highlight.ScoreOrderFragmentsBuilder"> <!-- <lst name="defaults"> <str name="hl.multiValuedSeparatorChar">/</str> </lst> --> </fragmentsBuilder> <!-- multi-colored tag FragmentsBuilder --> <fragmentsBuilder name="colored" class="solr.highlight.ScoreOrderFragmentsBuilder"> <lst name="defaults"> <str name="hl.tag.pre"><![CDATA[ <b style="background:yellow">,<b style="background:lawgreen">, <b style="background:aquamarine">,<b style="background:magenta">, <b style="background:palegreen">,<b style="background:coral">, <b style="background:wheat">,<b style="background:khaki">, <b style="background:lime">,<b style="background:deepskyblue">]]></str> <str name="hl.tag.post"><![CDATA[</b>]]></str> </lst> </fragmentsBuilder> <boundaryScanner name="default" default="true" class="solr.highlight.SimpleBoundaryScanner"> <lst name="defaults"> <str name="hl.bs.maxScan">10</str> <str name="hl.bs.chars">.,!? </str> </lst> </boundaryScanner> <boundaryScanner name="breakIterator" class="solr.highlight.BreakIteratorBoundaryScanner"> <lst name="defaults"> <!-- type should be one of CHARACTER, WORD(default), LINE and SENTENCE --> <str name="hl.bs.type">WORD</str> <!-- language and country are used when constructing Locale object. 
--> <!-- And the Locale object will be used when getting instance of BreakIterator --> <str name="hl.bs.language">en</str> <str name="hl.bs.country">US</str> </lst> </boundaryScanner> </highlighting> </searchComponent> <!-- Update Processors Chains of Update Processor Factories for dealing with Update Requests can be declared, and then used by name in Update Request Processors http://wiki.apache.org/solr/UpdateRequestProcessor --> <!-- Deduplication An example dedup update processor that creates the "id" field on the fly based on the hash code of some other fields. This example has overwriteDupes set to false since we are using the id field as the signatureField and Solr will maintain uniqueness based on that anyway. --> <!-- <updateRequestProcessorChain name="dedupe"> <processor class="solr.processor.SignatureUpdateProcessorFactory"> <bool name="enabled">true</bool> <str name="signatureField">id</str> <bool name="overwriteDupes">false</bool> <str name="fields">name,features,cat</str> <str name="signatureClass">solr.processor.Lookup3Signature</str> </processor> <processor class="solr.LogUpdateProcessorFactory" /> <processor class="solr.RunUpdateProcessorFactory" /> </updateRequestProcessorChain> --> <!-- Language identification This example update chain identifies the language of the incoming documents using the langid contrib. The detected language is written to field language_s. No field name mapping is done. The fields used for detection are text, title, subject and description, making this example suitable for detecting languages form full-text rich documents injected via ExtractingRequestHandler. See more about langId at http://wiki.apache.org/solr/LanguageDetection --> <!-- <updateRequestProcessorChain name="langid"> <processor class="org.apache.solr.update.processor.TikaLanguageIdentifierUpdateProcessorFactory"> <str name="langid.fl">text,title,subject,description</str> <str name="langid.langField">language_s</str> <str name="langid.fallback">en</str> </processor> <processor class="solr.LogUpdateProcessorFactory" /> <processor class="solr.RunUpdateProcessorFactory" /> </updateRequestProcessorChain> --> <!-- Script update processor This example hooks in an update processor implemented using JavaScript. See more about the script update processor at http://wiki.apache.org/solr/ScriptUpdateProcessor --> <!-- <updateRequestProcessorChain name="script"> <processor class="solr.StatelessScriptUpdateProcessorFactory"> <str name="script">update-script.js</str> <lst name="params"> <str name="config_param">example config parameter</str> </lst> </processor> <processor class="solr.RunUpdateProcessorFactory" /> </updateRequestProcessorChain> --> <!-- Response Writers http://wiki.apache.org/solr/QueryResponseWriter Request responses will be written using the writer specified by the 'wt' request parameter matching the name of a registered writer. The "default" writer is the default and will be used if 'wt' is not specified in the request. --> <!-- The following response writers are implicitly configured unless overridden... 
--> <!-- <queryResponseWriter name="xml" default="true" class="solr.XMLResponseWriter" /> <queryResponseWriter name="json" class="solr.JSONResponseWriter"/> <queryResponseWriter name="python" class="solr.PythonResponseWriter"/> <queryResponseWriter name="ruby" class="solr.RubyResponseWriter"/> <queryResponseWriter name="php" class="solr.PHPResponseWriter"/> <queryResponseWriter name="phps" class="solr.PHPSerializedResponseWriter"/> <queryResponseWriter name="csv" class="solr.CSVResponseWriter"/> <queryResponseWriter name="schema.xml" class="solr.SchemaXmlResponseWriter"/> --> <queryResponseWriter name="json" class="solr.JSONResponseWriter"> <!-- For the purposes of the tutorial, JSON responses are written as plain text so that they are easy to read in *any* browser. If you expect a MIME type of "application/json" just remove this override. --> <str name="content-type">text/plain; charset=UTF-8</str> </queryResponseWriter> <!-- Custom response writers can be declared as needed... --> <queryResponseWriter name="velocity" class="solr.VelocityResponseWriter" startup="lazy"/> <!-- XSLT response writer transforms the XML output by any xslt file found in Solr's conf/xslt directory. Changes to xslt files are checked for every xsltCacheLifetimeSeconds. --> <queryResponseWriter name="xslt" class="solr.XSLTResponseWriter"> <int name="xsltCacheLifetimeSeconds">5</int> </queryResponseWriter> <!-- Query Parsers http://wiki.apache.org/solr/SolrQuerySyntax Multiple QParserPlugins can be registered by name, and then used in either the "defType" param for the QueryComponent (used by SearchHandler) or in LocalParams --> <!-- example of registering a query parser --> <!-- <queryParser name="myparser" class="com.mycompany.MyQParserPlugin"/> --> <!-- Function Parsers http://wiki.apache.org/solr/FunctionQuery Multiple ValueSourceParsers can be registered by name, and then used as function names when using the "func" QParser. --> <!-- example of registering a custom function parser --> <!-- <valueSourceParser name="myfunc" class="com.mycompany.MyValueSourceParser" /> --> <!-- Document Transformers http://wiki.apache.org/solr/DocTransformers --> <!-- Could be something like: <transformer name="db" class="com.mycompany.LoadFromDatabaseTransformer" > <int name="connection">jdbc://....</int> </transformer> To add a constant value to all docs, use: <transformer name="mytrans2" class="org.apache.solr.response.transform.ValueAugmenterFactory" > <int name="value">5</int> </transformer> If you want the user to still be able to change it with _value:something_ use this: <transformer name="mytrans3" class="org.apache.solr.response.transform.ValueAugmenterFactory" > <double name="defaultValue">5</double> </transformer> If you are using the QueryElevationComponent, you may wish to mark documents that get boosted. The EditorialMarkerFactory will do exactly that: <transformer name="qecBooster" class="org.apache.solr.response.transform.EditorialMarkerFactory" /> --> <!-- Legacy config for the admin interface --> <admin> <defaultQuery>*:*</defaultQuery> </admin> </config> |
show dbs
use mydb
db
help

j = { name : "mongo" }
k = { x : 3 }
db.testData.insert( j )
db.testData.insert( k )
show collections
db.testData.find()

var c = db.testData.find()
while ( c.hasNext() ) printjson( c.next() )

var c = db.testData.find()
printjson( c[1] )

db.testData.find( { x : 3 } )
db.testData.findOne()
db.testData.find().limit(3)
for (var i = 1; i <= 25; i++) db.testData.insert( { x : i } )
db.testData.find()
MongoDB query operations, with the same queries expressed in SQL
Projections
"_id": 0, "name": 1 , "email": 1 => select name, email
Query Behavior
MongoDB queries exhibit the following behavior:
- All queries in MongoDB address a single collection.
- You can modify the query to impose limits, skips, and sort orders.
- The order of documents returned by a query is not defined unless you specify a sort().
- Operations that modify existing documents (i.e. updates) use the same query syntax as queries to select documents to update.
- In the aggregation pipeline, the $match pipeline stage provides access to MongoDB queries.
- MongoDB provides a db.collection.findOne() method as a special case of find() that returns a single document.
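As a rough sketch of limits, skips, and sort orders from the 2.x Java driver (the collection, field names, and paging values below are illustrative assumptions):

import com.mongodb.BasicDBObject;
import com.mongodb.DBCollection;
import com.mongodb.DBCursor;
import com.mongodb.MongoClient;

public class CursorModifiers {
    public static void main(String[] args) throws Exception {
        MongoClient mongo = new MongoClient("192.168.6.129", 27017);
        DBCollection inventory = mongo.getDB("test").getCollection("inventory");

        // third page of food items, 5 per page, cheapest first
        DBCursor cursor = inventory.find(new BasicDBObject("type", "food"))
                .sort(new BasicDBObject("price", 1)) // 1 = ascending
                .skip(10)
                .limit(5);
        while (cursor.hasNext()) {
            System.out.println(cursor.next());
        }
        mongo.close();
    }
}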
Cursor Information
You can use the command cursorInfo to retrieve the following information on cursors:
- total number of open cursors
- size of the client cursors in current use
- number of timed out cursors since the last server restart

Consider the following example:

db.runCommand( { cursorInfo: 1 } )

The result from the command returns the following document:

{
  "totalOpen" : <number>,
  "clientCursors_size" : <number>,
  "timedOut" : <number>,
  "ok" : 1
}
Analyze Query Performance To use the explain() method, call the method on a cursor returned by find().
db.inventory.find( { type: 'food' } ).explain()
{
  "cursor" : "BtreeCursor type_1",
  "isMultiKey" : false,
  "n" : 5,
  "nscannedObjects" : 5,
  "nscanned" : 5,
  "nscannedObjectsAllPlans" : 5,
  "nscannedAllPlans" : 5,
  "scanAndOrder" : false,
  "indexOnly" : false,
  "nYields" : 0,
  "nChunkSkips" : 0,
  "millis" : 0,
  "indexBounds" : { "type" : [ [ "food", "food" ] ] },
  "server" : "mongodbo0.example.net:27017"
}
With an upsert, applications can decide between performing an update or an insert operation using just a single call. Both the update() method and the save() method can perform an upsert. See update() and save() for details on performing an upsert with these methods.
See the SQL to MongoDB Mapping Chart for additional examples of MongoDB write operations and the corresponding SQL statements.
db.inventory.insert( { _id: 10, type: "misc", item: "card", qty: 15 } )

db.inventory.update(
  { type: "book", item : "journal" },
  { $set : { qty: 10 } },
  { upsert : true }
)

db.inventory.save( { type: "book", item: "notebook", qty: 40 } )

db.inventory.find( { type: "snacks" } )

db.inventory.find( { type: { $in: [ 'food', 'snacks' ] } } )

db.inventory.find( { type: 'food', price: { $lt: 9.95 } } )

db.inventory.find( { $or: [ { qty: { $gt: 100 } }, { price: { $lt: 9.95 } } ] } )
Specify AND as well as OR Conditions
With additional clauses, you can specify precise conditions for matching documents.
In the following example, the compound query document selects all documents in the collection where the value of the type field is 'food' and either the qty has a value greater than ($gt) 100 or the value of the price field is less than ($lt) 9.95:
db.inventory.find(
  { type: 'food',
    $or: [ { qty: { $gt: 100 } }, { price: { $lt: 9.95 } } ]
  }
)
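The same compound query built with the 2.x Java driver might look like the following minimal sketch (connection details are illustrative):

import java.util.Arrays;

import com.mongodb.BasicDBObject;
import com.mongodb.DBCollection;
import com.mongodb.DBCursor;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;

public class AndOrQuery {
    public static void main(String[] args) throws Exception {
        MongoClient mongo = new MongoClient("192.168.6.129", 27017);
        DBCollection inventory = mongo.getDB("test").getCollection("inventory");

        // type = 'food' AND (qty > 100 OR price < 9.95)
        DBObject qtyGt100 = new BasicDBObject("qty", new BasicDBObject("$gt", 100));
        DBObject priceLt = new BasicDBObject("price", new BasicDBObject("$lt", 9.95));
        DBObject query = new BasicDBObject("type", "food")
                .append("$or", Arrays.asList(qtyGt100, priceLt));

        DBCursor cursor = inventory.find(query);
        while (cursor.hasNext()) {
            System.out.println(cursor.next());
        }
        mongo.close();
    }
}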
For details, see http://docs.mongodb.org/manual/tutorial/query-documents/
Remove Documents
Remove All Documents The following example removes all documents from the inventory collection:
db.inventory.remove()
Remove Documents that Match a Condition
db.inventory.remove( { type : "food" } )
Remove a Single Document that Matches a Condition To remove a single document, call the remove() method with the justOne parameter set to true or 1.
db.inventory.remove( { type : "food" }, 1 )
Query Cursor Methods
Name              Description
cursor.count()    Returns a count of the documents in a cursor.
cursor.explain()  Reports on the query execution plan, including index use, for a cursor.
cursor.hint()     Forces MongoDB to use a specific index for a query.
cursor.limit()    Constrains the size of a cursor's result set.
cursor.next()     Returns the next document in a cursor.
cursor.skip()     Returns a cursor that begins returning results only after passing or skipping a number of documents.
cursor.sort()     Returns results ordered according to a sort specification.
cursor.toArray()  Returns an array that contains all documents returned by the cursor.
Query and Data Manipulation Collection Methods
Name                      Description
db.collection.count()     Wraps count to return a count of the number of documents in a collection or matching a query.
db.collection.distinct()  Returns an array of documents that have distinct values for the specified field.
db.collection.find()      Performs a query on a collection and returns a cursor object.
db.collection.findOne()   Performs a query and returns a single document.
db.collection.insert()    Creates a new document in a collection.
db.collection.remove()    Deletes documents from a collection.
db.collection.save()      Provides a wrapper around insert() and update() to insert new documents.
db.collection.update()    Modifies a document in a collection.
DBrefs
My personal take: MongoDB stores data in a key-value fashion, so complex relational associations between documents are discouraged. See: Mongodb联合查询 (MongoDB join queries), MongoDB的多表关联操作 (multi-collection associations in MongoDB).
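When an association is unavoidable, the common pattern is to store the referenced _id and issue a second query from the application instead of a join. A minimal sketch with the 2.x Java driver; the collections and the userId field are illustrative assumptions:

import com.mongodb.BasicDBObject;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;

public class ManualReference {
    public static void main(String[] args) throws Exception {
        MongoClient mongo = new MongoClient("192.168.6.129", 27017);
        DBCollection orders = mongo.getDB("test").getCollection("orders");
        DBCollection users = mongo.getDB("test").getCollection("users");

        // the order document stores the referenced user's _id in "userId"...
        DBObject order = orders.findOne(new BasicDBObject("_id", 10));
        if (order != null) {
            // ...and the application resolves the reference with a second query
            DBObject user = users.findOne(new BasicDBObject("_id", order.get("userId")));
            System.out.println(user);
        }
        mongo.close();
    }
}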
ObjectId
ObjectId is a 12-byte BSON type, constructed using:
- a 4-byte value representing the seconds since the Unix epoch,
- a 3-byte machine identifier,
- a 2-byte process id, and
- a 3-byte counter, starting with a random value.
ObjectId is an interesting little thing:
> ObjectId("507c7f79bcf86cd7994f6c0e").getTimestamp()
ISODate("2012-10-15T21:26:17Z")
- In the mongo shell, you can access the creation time of the ObjectId using the getTimestamp() method.
- Sorting on an _id field that stores ObjectId values is roughly equivalent to sorting by creation time.
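The timestamp really is just the first four bytes of the id. A driver-free sketch that decodes it by hand, reusing the example id above:

import java.util.Date;

public class ObjectIdTimestamp {
    public static void main(String[] args) {
        String oid = "507c7f79bcf86cd7994f6c0e";
        // first 4 bytes (8 hex chars) = seconds since the Unix epoch
        long seconds = Long.parseLong(oid.substring(0, 8), 16);
        // same instant as getTimestamp() above: 2012-10-15T21:26:17Z
        System.out.println(new Date(seconds * 1000L));
    }
}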
Enable Authentication
MongoDB Drivers and Client Libraries
I just noticed there is also Mongo training. Use MMS (MongoDB Monitoring Service) to monitor MongoDB.
1. Download mongodb-linux-i686-2.4.9.tgz
2. tar xvf mongodb-linux-i686-2.4.9.tgz
3. mkdir -p /data/db/    # by default the database files are stored in /data/db/
4. chown `id -u` /data/db
5. chown -R mongodb:mongodb db/    # this step is important
6. ./mongod

OR

1. sudo apt-get install mongodb-10gen
   Installing new version of config file /etc/mongodb.conf ...
   Installing new version of config file /etc/init/mongodb.conf ...
$ mongod
./mongod --help for help and startup options
Mon Mar 3 16:27:50.735
Mon Mar 3 16:27:50.738 warning: 32-bit servers don't have journaling enabled by default. Please use --journal if you want durability.
Mon Mar 3 16:27:50.739
Mon Mar 3 16:27:51.037 [initandlisten] MongoDB starting : pid=19039 port=27017 dbpath=/data/db/ 32-bit host=ubuntu
Mon Mar 3 16:27:51.039 [initandlisten]
Mon Mar 3 16:27:51.040 [initandlisten] ** NOTE: This is a 32 bit MongoDB binary.
Mon Mar 3 16:27:51.041 [initandlisten] **       32 bit builds are limited to less than 2GB of data (or less with --journal).
Mon Mar 3 16:27:51.042 [initandlisten] **       Note that journaling defaults to off for 32 bit and is currently off.
Mon Mar 3 16:27:51.042 [initandlisten] **       See http://dochub.mongodb.org/core/32bit
Mon Mar 3 16:27:51.044 [initandlisten]
Mon Mar 3 16:27:51.047 [initandlisten] db version v2.4.9
Mon Mar 3 16:27:51.047 [initandlisten] git version: 52fe0d21959e32a5bdbecdc62057db386e4e029c
Mon Mar 3 16:27:51.047 [initandlisten] build info: Linux bs-linux32.10gen.cc 2.6.21.7-2.fc8xen #1 SMP Fri Feb 15 12:39:36 EST 2008 i686 BOOST_LIB_VERSION=1_49
Mon Mar 3 16:27:51.048 [initandlisten] allocator: system
Mon Mar 3 16:27:51.048 [initandlisten] options: {}
Mon Mar 3 16:27:51.271 [FileAllocator] allocating new datafile /data/db/local.ns, filling with zeroes...
Mon Mar 3 16:27:51.273 [FileAllocator] creating directory /data/db/_tmp
Mon Mar 3 16:27:51.355 [FileAllocator] done allocating datafile /data/db/local.ns, size: 16MB, took 0.045 secs
Mon Mar 3 16:27:51.357 [FileAllocator] allocating new datafile /data/db/local.0, filling with zeroes...
Mon Mar 3 16:27:51.432 [FileAllocator] done allocating datafile /data/db/local.0, size: 16MB, took 0.002 secs
Mon Mar 3 16:27:51.510 [initandlisten] command local.$cmd command: { create: "startup_log", size: 10485760, capped: true } ntoreturn:1 keyUpdates:0 reslen:37 239ms
Mon Mar 3 16:27:51.520 [websvr] admin web console waiting for connections on port 28017
Mon Mar 3 16:27:51.522 [initandlisten] waiting for connections on port 27017
Open another console:
$ mongo
MongoDB shell version: 2.4.9
connecting to: test
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
        http://docs.mongodb.org/
Questions? Try the support group
        http://groups.google.com/group/mongodb-user
Server has startup warnings:
Mon Mar 3 16:39:22.629 [initandlisten]
Mon Mar 3 16:39:22.631 [initandlisten] ** NOTE: This is a 32 bit MongoDB binary.
Mon Mar 3 16:39:22.631 [initandlisten] **       32 bit builds are limited to less than 2GB of data (or less with --journal).
Mon Mar 3 16:39:22.632 [initandlisten] **       See http://dochub.mongodb.org/core/32bit
Mon Mar 3 16:39:22.633 [initandlisten]
>
The following script you have to write yourself; if you installed via apt-get it is not needed.
/etc/init.d/mongodb
#!/bin/sh
### BEGIN INIT INFO
# Provides:          mongodb
# Required-Start:
# Required-Stop:
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: mongodb
# Description:       mongo db server
### END INIT INFO

. /lib/lsb/init-functions

PROGRAM=/tools/mongodb-linux-i686-2.4.9/bin/mongod
MONGOPID=`ps -ef | grep 'mongod' | grep -v grep | awk '{print $2}'`

test -x $PROGRAM || exit 0

case "$1" in
  start)
    ulimit -n 3000
    log_begin_msg "Starting MongoDB server"
    $PROGRAM --fork --quiet -journal -maxConns=2400 -rest --logpath /data/db/journal/mongdb.log
    log_end_msg 0
    ;;
  stop)
    log_begin_msg "Stopping MongoDB server"
    if [ ! -z "$MONGOPID" ]; then
      kill -15 $MONGOPID
    fi
    log_end_msg 0
    ;;
  status)
    ;;
  *)
    log_success_msg "Usage: /etc/init.d/mongodb {start|stop|status}"
    exit 1
esac

exit 0
Stopping / starting the service
sudo service mongodb stop
sudo service mongodb start

Browsing to `http://192.168.6.129:27017/` returns:
You are trying to access MongoDB on the native driver port. For http diagnostic access, add 1000 to the port number
Starting the mongodb server manually, the quick and dirty way
./mongod -journal -maxConns=2400 -rest

-journal turns on journaling, -maxConns=2400 lets mongodb accept 2400 TCP connections, and -rest allows clients to access the mongodb server through the REST API.
You can also start with the --quiet parameter for quiet mode, which reduces the number of items logged; note that this parameter requires a log path to be specified at the same time, for example: --quiet --logpath /data/db/journal/mongdb.log
Raising the system's maximum number of open connections
The connection limit above exists because Linux by default allows a process at most 1024 open files. Check with ulimit -a and you will see this line:

open files (-n) 1024

Edit the /etc/security/limits.conf configuration file, e.g. with: sudo gedit /etc/security/limits.conf
Add to the file:

* soft nofile 3000
* hard nofile 20000
root soft nofile 3000
root hard nofile 20000

"*" means the entry applies to all users; the root user needs its own two extra lines. The hard limit is usually computed from the system's hardware resources (mainly memory) as the maximum number of files the system can have open at the same time; the soft limit is a further restriction on top of that, so the soft limit must be lower than the hard limit. nofile stands for the max number of open files.
Reboot the machine, then check with ulimit -a again:

open files (-n) 3000

It has taken effect. Start the mongodb server again and the problem is solved.
With the server started and logging as above, let's test from another terminal whether the server works. Go into /usr/local/mongodb-linux-x86_64-2.0.2/bin and run ./mongo; you should see:
MongoDB shell version: 2.0.2
connecting to: test
Run:
> db.foo.save({ 1 : "Hello world" })
> db.foo.find();
{ "_id" : ObjectId("4e4b395986738efa2d0718b9"), "1" : "hello world" }
If you have made it this far, congratulations: mongodb is installed successfully.
You can also connect to a remote mongodb server this way; the default port is 27017. For example: ./mongo 192.168.6.129
If the mydb database does not exist, the client command use mydb will create it and switch the current database to mydb. At this point show dbs does not yet display the database name. Use the db.stats() command to check the state of the current database.
> db    // shows the current database
MongoVUE is an innovative MongoDB desktop application for Windows OS that gives you an elegant and highly usable GUI interface to work with MongoDB. Now there is one less worry in managing your web-scale data.
Connection Config
Server: 192.168.6.129
Port: 27017
Defend against it by generating a pseudorandom value on the server side; the same technique can also be used to prevent duplicate form submissions.
CSRF (Cross-site request forgery), also known as a one-click attack or session riding, is usually abbreviated CSRF or XSRF.

You can think of a CSRF attack this way: the attacker steals your identity and sends malicious requests in your name. CSRF can send email and messages in your name, steal your account, even buy goods or transfer virtual currency, causing both privacy leaks and financial loss.

CSRF as an attack technique was described by security researchers abroad as early as 2000, but in China it received little attention until 2006. In 2008 several large communities and interactive sites at home and abroad disclosed CSRF vulnerabilities, among them NYTimes.com (the New York Times), Metafilter (a large blog site), YouTube and Baidu HI. Even today many sites on the Internet remain unprotected, to the point that the security industry calls CSRF the "sleeping giant".
The diagram below gives a simple illustration of the idea behind a CSRF attack:

As the diagram shows, to complete a CSRF attack the victim must take two steps in sequence:

1. Log in to trusted site A, generating a cookie locally.

2. Visit dangerous site B without logging out of A.
At this point you might say: "If I fail to meet either of those two conditions, I will not be attacked via CSRF." True, but you cannot guarantee the following never happens:

1. You cannot guarantee that, once logged in to a site, you will never open another tab and visit a different site.

2. You cannot guarantee that closing your browser makes your local cookies expire immediately and ends your previous session. (In fact, closing the browser does not end a session, although most people mistakenly believe that closing the browser amounts to logging out / ending the session.)

3. The "attack site" in the diagram may be a trusted, frequently visited site that merely has some other vulnerability.

In essence, CSRF attacks stem from the web's implicit authentication mechanism. That mechanism can guarantee that a request comes from a certain user's browser, but it cannot guarantee that the user actually approved sending the request.
Server-side CSRF defenses come in many flavors, but the underlying idea is the same: add a pseudorandom value to the pages served to the client.

(1) Cookie Hashing (all forms carry the same pseudorandom value):

This is probably the simplest solution, because an attacker cannot (in theory) obtain a third party's cookies, so the data in the forged form fails to validate.
Personally I think this method already stops 99% of CSRF attacks. What about the remaining 1%? A user's cookies can easily be stolen through an XSS vulnerability on the site, and that accounts for the 1%. Most attackers give up as soon as they see a hash value has to be computed, though some will not, so if you need 100% protection this is not the best method.
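A minimal sketch of the cookie-hashing idea in Java; the secret and hashing scheme are my assumptions, not a specific framework's API. The server derives a token from the session cookie plus a server-side secret, embeds it in every form, and recomputes it on submit:

import java.security.MessageDigest;

public class CookieHash {
    private static final String SERVER_SECRET = "change-me"; // illustrative secret

    // token embedded in each form as a hidden field
    static String formToken(String sessionCookie) throws Exception {
        MessageDigest sha = MessageDigest.getInstance("SHA-256");
        byte[] digest = sha.digest((sessionCookie + SERVER_SECRET).getBytes("UTF-8"));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) hex.append(String.format("%02x", b));
        return hex.toString();
    }

    // on submit: recompute from the request's cookie and compare
    static boolean isValid(String sessionCookie, String submittedToken) throws Exception {
        return formToken(sessionCookie).equals(submittedToken);
    }
}

An attacker on another site cannot read the victim's cookie, so they cannot compute a matching token; but as noted above, an XSS hole that leaks the cookie defeats this scheme.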
(2) CAPTCHA

The idea of this scheme: every submission requires the user to type a random string displayed in an image. This can completely solve CSRF, but in my view usability suffers; there are also reports that CAPTCHA images were involved in a bug known as MHTML, which affected some versions of Microsoft IE.

(3) One-Time Tokens (each form carries a different pseudorandom value)

When implementing one-time tokens, one point deserves attention: compatibility with parallel sessions. If a user opens two different forms of the same site at once, the CSRF protection must not interfere with submitting either of them. Consider what happens if the site generates a new pseudorandom value on every form load, overwriting the previous one: the user can only successfully submit the form opened last, because all the other forms now contain an invalid value. Care must be taken so that CSRF protection does not break tabbed browsing or browsing a site with several browser windows.
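A minimal one-time-token sketch in plain Java; the class and method names are mine, and in practice the token map would live in the user's server-side session. Keying tokens by form id is what keeps parallel sessions compatible:

import java.math.BigInteger;
import java.security.SecureRandom;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class OneTimeTokens {
    private final SecureRandom random = new SecureRandom();
    // one token per form id, so two forms open in parallel tabs stay independently valid
    private final Map<String, String> tokens = new ConcurrentHashMap<String, String>();

    // called when rendering a form; embed the result as a hidden field
    public String issue(String formId) {
        String token = new BigInteger(130, random).toString(32);
        tokens.put(formId, token);
        return token;
    }

    // called on submit; the token is removed so it can be used only once
    public boolean consume(String formId, String submitted) {
        String expected = tokens.remove(formId);
        return expected != null && expected.equals(submitted);
    }
}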
A classic producer-consumer model. Application: asynchronous processing of time-consuming operations that do not need to return results immediately.
RabbitMQ's structure is shown in the diagram below:
- Broker: simply put, the message-queue server entity itself.
- Exchange: the message exchange; it determines by what rules, and to which queue, messages are routed.
- Queue: the message queue carrier; every message is put into one or more queues.
- Binding: binds an exchange to a queue according to routing rules.
- Routing Key: the routing keyword; the exchange delivers messages based on it.
- vhost: virtual host; one broker can host multiple vhosts, used to separate the permissions of different users.
- producer: the message producer, i.e. the program that publishes messages.
- consumer: the message consumer, i.e. the program that receives messages.
- channel: the message channel; each client connection can open multiple channels, and each channel represents one session task.
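A sketch tying these pieces together with the RabbitMQ Java client; the exchange, queue, and routing key names ("orders", "order.created", "created") are made up for illustration:

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class TopologyExample {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("192.168.6.129");             // the broker
        Connection connection = factory.newConnection();
        Channel channel = connection.createChannel(); // one session on the connection

        channel.exchangeDeclare("orders", "direct", true);                // exchange
        channel.queueDeclare("order.created", true, false, false, null);  // queue
        channel.queueBind("order.created", "orders", "created");          // binding
        // the producer publishes with routing key "created"
        channel.basicPublish("orders", "created", null, "hello".getBytes("UTF-8"));

        channel.close();
        connection.close();
    }
}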
$ sudo apt-get install erlang-nox
$ sudo dpkg -i rabbitmq-server_3.2.3-1_all.deb
/etc/init.d/rabbitmq-server start
 * Starting message broker rabbitmq-server    [ OK ]
/etc/init.d/rabbitmq-server stop
 * Stopping message broker rabbitmq-server    [ OK ]
/etc/init.d/rabbitmq-server restart
 * Restarting message broker rabbitmq-server  [ OK ]
RABBITMQ_NODE_PORT:5672
Configuration: rabbitmq-env.conf
The exact meaning of RabbitMQ's startup parameters: I haven't looked into them carefully yet.
$ rabbitmq-plugins enable rabbitmq_management
$ ls /etc/rabbitmq
$ /etc/init.d/rabbitmq-server restart
Then open: http://localhost:15672
Username/password: guest/guest
1. Add a user
$ rabbitmqctl add_user username password
2. Delete a user
$ rabbitmqctl delete_user username
3. Change a password
$ rabbitmqctl change_password username newpassword
4. List all users
$ rabbitmqctl list_users
5. Grant a user a role
$ rabbitmqctl set_user_tags newuser administrator
1. List all plugins
$ rabbitmq-plugins list
[e] amqp_client 3.2.3
[ ] cowboy 0.5.0-rmq3.2.3-git4b93c2d
[ ] eldap 3.2.3-gite309de4
[e] mochiweb 2.7.0-rmq3.2.3-git680dba8
[ ] rabbitmq_amqp1_0 3.2.3
[ ] rabbitmq_auth_backend_ldap 3.2.3
[ ] rabbitmq_auth_mechanism_ssl 3.2.3
[ ] rabbitmq_consistent_hash_exchange 3.2.3
[ ] rabbitmq_federation 3.2.3
[ ] rabbitmq_federation_management 3.2.3
[ ] rabbitmq_jsonrpc 3.2.3
[ ] rabbitmq_jsonrpc_channel 3.2.3
[ ] rabbitmq_jsonrpc_channel_examples 3.2.3
[E] rabbitmq_management 3.2.3
[e] rabbitmq_management_agent 3.2.3
[ ] rabbitmq_management_visualiser 3.2.3
[ ] rabbitmq_mqtt 3.2.3
[ ] rabbitmq_shovel 3.2.3
[ ] rabbitmq_shovel_management 3.2.3
[ ] rabbitmq_stomp 3.2.3
[ ] rabbitmq_tracing 3.2.3
[e] rabbitmq_web_dispatch 3.2.3
[ ] rabbitmq_web_stomp 3.2.3
[ ] rabbitmq_web_stomp_examples 3.2.3
[ ] rfc4627_jsonrpc 3.2.3-git5e67120
[ ] sockjs 0.3.4-rmq3.2.3-git3132eb9
[e] webmachine 1.10.3-rmq3.2.3-gite9359c7

2. Enable a plugin
$ rabbitmq-plugins enable rabbitmq_management

3. Disable a plugin
$ rabbitmq-plugins disable rabbitmq_management
Code taken from: RabbitMQ 入门指南 (Java)
Jar dependencies: amqp-client-3.0.4.jar, commons-lang-2.6.jar
EndPoint.java
package examples.rabbitmq;

import java.io.IOException;

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public abstract class EndPoint {

    protected Channel channel;
    protected Connection connection;
    protected String endPointName;

    public EndPoint(String endpointName) throws IOException {
        this.endPointName = endpointName;

        // Create a connection factory
        ConnectionFactory factory = new ConnectionFactory();

        // hostname of your rabbitmq server
        factory.setHost("192.168.6.129");

        // getting a connection
        connection = factory.newConnection();

        // creating a channel
        channel = connection.createChannel();

        // declaring a queue for this channel. If queue does not exist,
        // it will be created on the server.
        channel.queueDeclare(endpointName, false, false, false, null);
    }

    /**
     * Close the channel and connection. Not strictly required,
     * since they are closed implicitly anyway.
     *
     * @throws IOException
     */
    public void close() throws IOException {
        this.channel.close();
        this.connection.close();
    }
}
Producer.java
package examples.rabbitmq;

import java.io.IOException;
import java.io.Serializable;

import org.apache.commons.lang.SerializationUtils;

/**
 * The producer endpoint that writes to the queue.
 * @author syntx
 */
public class Producer extends EndPoint {

    public Producer(String endPointName) throws IOException {
        super(endPointName);
    }

    public void sendMessage(Serializable object) throws IOException {
        channel.basicPublish("", endPointName, null, SerializationUtils.serialize(object));
    }
}
QueueConsumer.java
package examples.rabbitmq;

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.commons.lang.SerializationUtils;

import com.rabbitmq.client.AMQP.BasicProperties;
import com.rabbitmq.client.Consumer;
import com.rabbitmq.client.Envelope;
import com.rabbitmq.client.ShutdownSignalException;

/**
 * The endpoint that reads from the queue; implements Runnable.
 * @author syntx
 */
public class QueueConsumer extends EndPoint implements Runnable, Consumer {

    public QueueConsumer(String endPointName) throws IOException {
        super(endPointName);
    }

    public void run() {
        try {
            // start consuming messages. Auto acknowledge messages.
            channel.basicConsume(endPointName, true, this);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /**
     * Called when consumer is registered.
     */
    public void handleConsumeOk(String consumerTag) {
        System.out.println("Consumer " + consumerTag + " registered");
    }

    /**
     * Called when new message is available.
     */
    public void handleDelivery(String consumerTag, Envelope env,
            BasicProperties props, byte[] body) throws IOException {
        Map map = (HashMap) SerializationUtils.deserialize(body);
        System.out.println("Message Number " + map.get("message number") + " received.");
    }

    public void handleCancel(String consumerTag) {}
    public void handleCancelOk(String consumerTag) {}
    public void handleRecoverOk(String consumerTag) {}
    public void handleShutdownSignal(String consumerTag, ShutdownSignalException arg1) {}
}
Main.java
package examples.rabbitmq;

import java.io.IOException;
import java.sql.SQLException;
import java.util.HashMap;

public class Main {

    public Main() throws Exception {
        QueueConsumer consumer = new QueueConsumer("queue");
        Thread consumerThread = new Thread(consumer);
        consumerThread.start();

        Producer producer = new Producer("queue");

        for (int i = 0; i < 100000; i++) {
            HashMap message = new HashMap();
            message.put("message number", i);
            producer.sendMessage(message);
            System.out.println("Message Number " + i + " sent.");
        }
    }

    /**
     * @param args
     * @throws SQLException
     * @throws IOException
     */
    public static void main(String[] args) throws Exception {
        new Main();
    }
}
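For comparison with the Java producer above, a rough Go equivalent might look like this; the third-party streadway/amqp client and the plain-text payload are my assumptions, not part of the original article:

package main

import (
	"log"

	"github.com/streadway/amqp" // third-party Go AMQP client (an assumption)
)

func main() {
	// Connect to the same broker the Java example uses.
	conn, err := amqp.Dial("amqp://guest:guest@192.168.6.129:5672/")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	ch, err := conn.Channel()
	if err != nil {
		log.Fatal(err)
	}
	defer ch.Close()

	// Declare the same non-durable queue the Java EndPoint declares.
	q, err := ch.QueueDeclare("queue", false, false, false, false, nil)
	if err != nil {
		log.Fatal(err)
	}

	// Publish through the default exchange, routed by queue name.
	err = ch.Publish("", q.Name, false, false, amqp.Publishing{
		ContentType: "text/plain",
		Body:        []byte("Message Number 0"),
	})
	if err != nil {
		log.Fatal(err)
	}
}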
References:
消息队列RabbitMQ入门介绍
可扩展Web架构与分布式系统 (wanted to read this one, but the article is really long -_-!)
RabbitMQ之消息发布订阅与信息持久化技术
RabbitMQ启动参数具体含义
rabbitmq-service用户手册
RabbitMQ的安装,配置,监控
1. Oracle's VBox is awful; it crashed yet again, which I can't stand. Uninstalled.
2. VMWare requires registration... skipped.
3. VMWare Player: good.
VMWare Player Download Link -> ubuntu iso ->
sudo apt-get install openssh-server

Try: ssh <username>@<remote ip>
Main references: link1, the mmonit doc link
The following comes from link1.
sudo apt-get install monit
---
sudo apt-get remove monit

/var/monit/monitrc            // configuration file

sudo /etc/init.d/monit start
sudo /etc/init.d/monit stop
sudo /etc/init.d/monit restart
##
## Sample monit configuration file. Notes:
## 1. example.com is used as the domain throughout.
## 2. Anything ending in xxx is a placeholder name; change it to suit.
##
###############################################################################
## Monit control file
###############################################################################
#
# Check interval. The default of 2 minutes is a bit long for a website;
# tune as needed. Here it is set to 30 seconds.
set daemon 30

# Log file
set logfile /var/log/monit.log

#
# Mail server for notifications
#
#set mailserver mail.example.com
set mailserver localhost

#
# Notification mail format; the default below is shown for reference.
#
## Monit by default uses the following alert mail format:
##
## --8<--
## From: monit@$HOST                         # sender
## Subject: monit alert -- $EVENT $SERVICE   # subject
##
## $EVENT Service $SERVICE                   #
##                                           #
## Date: $DATE                               #
## Action: $ACTION                           #
## Host: $HOST                               # body
## Description: $DESCRIPTION                 #
##                                           #
## Your faithful employee,                   #
## monit                                     #
## --8<--
##
## You can override the alert message format or its parts such as subject
## or sender using the MAIL-FORMAT statement. Macros such as $DATE, etc.
## are expanded on runtime. For example to override the sender:
#
# Keeping it simple, only the sender is overridden here; edit the rest as needed.
set mail-format { from: webmaster@example.com }

# Alert recipient. Sending to gmail is recommended; it makes mail filtering easy.
set alert userxxx@gmail.com

set httpd port 2812 and           # port of the http monitoring page
    use address www.example.com   # IP or domain of the http monitoring page
    allow localhost               # allow local access
    allow 58.68.78.0/24           # allow this IP range
    ##allow 0.0.0.0/0.0.0.0       # allow any IP range; not recommended
    allow userxxx:passwordxxx     # access username and password

###############################################################################
## Services
###############################################################################
#
# Overall system health monitoring; the defaults are fine, tweak as you like.
#
# System name; can be an IP or a domain.
check system www.example.com
    if loadavg (1min) > 4 then alert
    if loadavg (5min) > 2 then alert
    if memory usage > 75% then alert
    if cpu usage (user) > 70% then alert
    if cpu usage (system) > 30% then alert
    if cpu usage (wait) > 20% then alert

#
# Monitor nginx
#
# The process pid file is required.
check process nginx with pidfile /var/run/nginx.pid
    # Process start command line. Note: it must be the full path.
    start program = "/etc/init.d/nginx start"
    # Process stop command line.
    stop program = "/etc/init.d/nginx stop"
    # nginx health test: if nginx stops responding, restart it automatically.
    if failed host www.example.com port 80 protocol http then restart
    # Give up after repeated failed restarts; that means a serious system error.
    if 3 restarts within 5 cycles then timeout
    # Optional grouping information.
    group server
    # Optional monitoring of the ssl port, if you have one.
    # if failed port 443 type tcpssl protocol http
    #    with timeout 15 seconds
    #    then restart

#
# Monitor apache
#
check process apache with pidfile /var/run/apache2.pid
    start program = "/etc/init.d/apache2 start"
    stop program = "/etc/init.d/apache2 stop"
    # apache is quite hungry for cpu and memory, so add extra checks for those.
    if cpu > 50% for 2 cycles then alert
    if cpu > 70% for 5 cycles then restart
    if totalmem > 1500 MB for 10 cycles then restart
    if children > 250 then restart
    if loadavg(5min) greater than 10 for 20 cycles then stop
    if failed host www.example.com port 8080 protocol http then restart
    if 3 restarts within 5 cycles then timeout
    group server
    # Optional dependency on nginx.
    depends on nginx

#
# Monitor the spawn-fcgi process (i.e. the fast-cgi processes)
#
check process spawn-fcgi with pidfile /var/run/spawn-fcgi.pid
    # spawn-fcgi only writes a pid file when started with -P; by default there is none.
    start program = "/usr/bin/spawn-fcgi -a 127.0.0.1 -p 8081 -C 10 -u userxxx -g groupxxx -P /var/run/spawn-fcgi.pid -f /usr/bin/php-cgi"
    stop program = "/usr/bin/killall /usr/bin/php-cgi"
    # fast-cgi does not speak http, and monit's protocol parameter has no cgi
    # option, so simply drop "protocol http" here.
    if failed host 127.0.0.1 port 8081 then restart
    if 3 restarts within 5 cycles then timeout
    group server
    depends on nginx
The commands in the start and stop program parameters must be full paths, otherwise monit will not start properly; killall, for example, must be written as /usr/bin/killall.
Many people use spawn-fcgi to manage PHP's fast-cgi processes, but spawn-fcgi itself can also die, so it too needs to be watched by monit. spawn-fcgi only writes a pid file when given the -P flag, and since fast-cgi does not speak http (and monit's protocol parameter has no cgi option), the protocol http setting must be removed for the check to work.
When a process fails to restart several times, monit stops trying. A notification mail like that means the system has hit a serious problem; take it seriously and intervene manually right away.
download: google-chrome-stable_current_i386.deb

sudo dpkg -i google-chrome-stable_current_i386.deb
sudo apt-get install -f
$ getconf LONG_BIT        // OS word size
32
$ lsb_release -a          // OS information
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 12.04.3 LTS
Release:        12.04
Codename:       precise

1  sudo apt-get purge openjdk*
2  Download: jdk-7u51-linux-i586.tar.gz
3  sudo su
4  mkdir /usr/lib/java
5  cd /usr/lib/java
6  mv ~/Downloads/jdk-7u51-linux-i586.tar.gz .
7  tar xvf jdk-7u51-linux-i586.tar.gz
8  mv jdk1.7.0_51/ java-7-sun
9  vi /etc/environment
10 PATH="/usr/lib/java/java-7-sun/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games"
   JAVA_HOME="/usr/lib/java/java-7-sun"
   CLASSPATH="/usr/lib/java/java-7-sun/lib"
11 source /etc/environment
12 java -version
13 Done
1  Download Tomcat
2  vi ~/.bashrc
3  export CATALINA_HOME=<apache-tomcat path>
   export PATH=$PATH:$CATALINA_HOME/bin
4  Done
#!/bin/bash
# Usage: tomcat [start|stop|reload|restart]
#
export JAVA_HOME=
export CATALINA_HOME=
export PATH=$PATH:$JAVA_HOME/bin:$CATALINA_HOME/bin
export PATH=$PATH:$HOME/bin
export BASH_ENV=$HOME/.bashrc
export USERNAME="root"

case "$1" in
start)
    echo -n "tomcat start: "
    cd $CATALINA_HOME/bin/
    sh startup.sh
    echo "Done"
    ;;
stop)
    echo -n "tomcat stop:"
    cd $CATALINA_HOME/bin/
    sh shutdown.sh
    echo "Done"
    ;;
restart)
    $0 stop
    $0 start
    ;;
*)
    echo "Usage: tomcat [start|stop|reload|restart]"
    exit 1
esac
exit 0
check process tomcat with pidfile /var/run/catalina.pid
    start program = "/etc/init.d/tomcat start"
    stop program = "/etc/init.d/tomcat stop"
    if 9 restarts within 10 cycles then timeout
    if failed url http://127.0.0.1:8080/ timeout 120 seconds for 5 cycles then restart
catalina.pid
vi $CATALINA_HOME/bin/catalina.sh

Add line:
CATALINA_PID=/var/run/catalina.pid
monit -h
monit -d 30
monit start tomcat || monit start all
monit status
CHECK: http://localhost:2812
monit stop tomcat
Let the code do the explaining:
package main

import (
	"crypto/md5"
	"crypto/sha1"
	"fmt"
)

func main() {
	TestString := "admin"

	Md5Inst := md5.New()
	Md5Inst.Write([]byte(TestString))
	Result := Md5Inst.Sum([]byte(""))
	fmt.Printf("%x\n", Result)

	Sha1Inst := sha1.New()
	Sha1Inst.Write([]byte(TestString))
	Result = Sha1Inst.Sum([]byte(""))
	fmt.Printf("%x\n", Result)
}
Pretty much like JUnit.
Terrible!!!
Packaging up the source code for distribution and making users compile it themselves!!!
I now thoroughly despise every kind of compile-it-yourself distribution!!! (Suddenly reminded of python, heh.)
Reflection is the embodiment of configurability and flexibility, but it really is a double-edged sword: high configurability and flexibility === verbosity and poor readability and maintainability.
Reflection in Go centers on two concepts: Type and Value.
Using the book's examples, noted down here.
// pass by value
var x float64 = 3.4
v := reflect.ValueOf(x)
v.SetFloat(4.1) // panics: v was obtained from a copy of x and is not settable
// pass by address
var x float64 = 3.4
p := reflect.ValueOf(&x)

fmt.Println("Type of P : ", p.Type())
fmt.Println("settability of p : ", p.CanSet())

v := p.Elem()
fmt.Println("settability of v : ", v.CanSet())

v.SetFloat(7.1)
fmt.Println("v : ", v.Interface())
fmt.Println("x : ", x)
Type of P :  *float64
settability of p :  false
settability of v :  true
v :  7.1
x :  7.1
Cgo: not interested.
Nothing much else here: interface{}? goroutines? defer sugar?
Go still needs time to mature; no idea how it will do on Android later. Leave it a way to survive, ha.
Packaging and releasing... what decade is this? Makefiles?
For now it feels suited to building something like a message queue, going head to head with Erlang?
Done
Oh, my poor lungs!
My head is now full of one thing: how to make money fast, retire early, and flee the imperial capital!!!
Sigh, time to study the grand art of winning the lottery.
-- a hard-up youth in the imperial capital
Just saw the delivery guy and thought: he's truly delivering packages with his life.
func Dial(net, addr string) (Conn, error)
// TCP
conn, err := net.Dial("tcp", "192.168.0.1:9000")

// UDP
conn, err := net.Dial("udp", "192.168.0.1:9000")

// ICMP
conn, err := net.Dial("ip4:icmp", "192.168.0.1:9000")
func Listen(net, laddr string) (l Listener, err error)
service := ":1200"

// TCP
ln, err := net.Listen("tcp", service)

// UDP (note: net.Listen is for stream protocols; for UDP use net.ListenPacket)
conn, err := net.ListenPacket("udp", service)
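Listen only hands you a listener; what usually follows is an accept loop. A minimal sketch (the echo behavior is my own illustration, not from the book):

package main

import (
	"io"
	"log"
	"net"
)

func main() {
	ln, err := net.Listen("tcp", ":1200")
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			continue // a failed accept should not kill the server
		}
		// One goroutine per connection: echo everything back.
		go func(c net.Conn) {
			defer c.Close()
			io.Copy(c, c)
		}(conn)
	}
}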
I THINK: It is cool !
func (c *Client) Get(url string) (r *Response, err error)
func (c *Client) Post(url string, bodyType string, body io.Reader) (r *Response, err error)
func (c *Client) PostForm(url string, data url.Values) (r *Response, err error)
func (c *Client) Head(url string) (r *Response, err error)
func (c *Client) Do(req *Request) (resp *Response, err error)
// Pasted from the web; being lazy~
package main

import (
	"fmt"
	"io/ioutil"
	"net/http"
)

func main() {
	client := &http.Client{}
	request, _ := http.NewRequest("GET", "http://www.baidu.com", nil)
	request.Header.Set("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8")
	request.Header.Set("Accept-Charset", "GBK,utf-8;q=0.7,*;q=0.3")
	request.Header.Set("Accept-Encoding", "gzip,deflate,sdch")
	request.Header.Set("Accept-Language", "zh-CN,zh;q=0.8")
	request.Header.Set("Cache-Control", "max-age=0")
	request.Header.Set("Connection", "keep-alive")

	response, _ := client.Do(request)
	if response.StatusCode == 200 {
		body, _ := ioutil.ReadAll(response.Body)
		bodystr := string(body)
		fmt.Println(bodystr)
	}
}
The client-side code comes from the web.
net.Listen
func ListenAndServe(addr string, handler Handler) error
package main

import (
	"io"
	"log"
	"net/http"
)

func helloWorld(w http.ResponseWriter, r *http.Request) {
	io.WriteString(w, "Hello World")
}

func main() {
	http.HandleFunc("/hello", helloWorld)
	err := http.ListenAndServe(":8099", nil)
	if err != nil {
		log.Fatal("ListenAndServe: ", err.Error())
	}
}
func ListenAndServeTLS(addr, certFile, keyFile string, handler Handler) error
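A minimal usage sketch; cert.pem and key.pem are placeholder paths you must supply yourself:

package main

import (
	"io"
	"log"
	"net/http"
)

func main() {
	http.HandleFunc("/hello", func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "Hello TLS")
	})
	// cert.pem / key.pem are placeholder certificate and key files;
	// generate or obtain real ones before running.
	log.Fatal(http.ListenAndServeTLS(":8443", "cert.pem", "key.pem", nil))
}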
Templates deserve a separate mention: they separate logic from presentation.
templates := make(map[string]*template.Template)
You have to write the handler function yourself; pretty lame.
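A minimal sketch of the cached-template-map idea above; hello.html and the handler name are illustrative assumptions:

package main

import (
	"html/template"
	"log"
	"net/http"
)

// Parse once at startup and cache, as in the map above.
// "hello.html" is a placeholder template file you must provide.
var templates = map[string]*template.Template{
	"hello": template.Must(template.ParseFiles("hello.html")),
}

func helloPage(w http.ResponseWriter, r *http.Request) {
	// Render the cached template with some data.
	if err := templates["hello"].Execute(w, map[string]string{"Name": "World"}); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
	}
}

func main() {
	http.HandleFunc("/hello", helloPage)
	log.Fatal(http.ListenAndServe(":8099", nil))
}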
Summary: sockets, like! HTTP, coming from Java habits, not a fan!
Can it be understood as: implementing the methods is implementing the interface...
var rw IReadWriter = ...
var r IReader = rw
Use case:
// The holder of an IReader wants to know whether the instance behind it
// also implements IReadWriter, so it can switch to IReadWriter's Writer() method.
var reader IReader = NewReader()
if writer, ok := reader.(IReadWriter); ok {
	writer.Writer()
}
Personally this feels a bit like inheritance.
Go's interfaces are defined very loosely; a bit too loosely. A small sketch follows.
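A sketch of "implementing the methods is implementing the interface", using hypothetical IReader/IReadWriter types to match the snippets above:

package main

import "fmt"

type IReader interface {
	Read() string
}

type IReadWriter interface {
	Read() string
	Write(s string)
}

// File implements both interfaces implicitly: there is no "implements"
// keyword, just matching method sets.
type File struct{ data string }

func (f *File) Read() string   { return f.data }
func (f *File) Write(s string) { f.data = s }

func main() {
	var reader IReader = &File{}
	// Type assertion: does this IReader also satisfy IReadWriter?
	if writer, ok := reader.(IReadWriter); ok {
		writer.Write("hello")
	}
	fmt.Println(reader.Read())
}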
Communication with sharing as the means:
Publish/subscribe, Ajax, Erlang message queues
Coroutines are called goroutines in Go: lightweight threads managed by the Go runtime.
Do not communicate by sharing memory; instead, share memory by communicating.
Coroutines, which some call lightweight threads; their characteristics:
How is that any different from how threads are explained relative to processes?
var channelName chan ElementType
ch := make(chan int)

// write a value
ch <- value

// read a value
value := <-ch

// one-way channels
var ch1 chan int        // ch1 is a normal, bidirectional channel
var ch2 chan<- float64  // ch2 is one-way, only for writing float64 values
var ch3 <-chan int      // ch3 is one-way, only for reading int values

// close a channel
close(ch)

// test whether the channel has been closed
x, ok := <-ch
// if ok == false, ch has been closed
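Tying the syntax above together, a small producer/consumer sketch:

package main

import "fmt"

// producer writes to a send-only channel and closes it when done.
func producer(ch chan<- int) {
	for i := 0; i < 5; i++ {
		ch <- i
	}
	close(ch)
}

func main() {
	ch := make(chan int)
	go producer(ch)
	// range stops automatically once the channel is closed.
	for v := range ch {
		fmt.Println("got", v)
	}
}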
Synchronization locks
Globally-unique, one-time operations
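A minimal sketch showing both primitives together, sync.Mutex for the lock and sync.Once for the one-time operation:

package main

import (
	"fmt"
	"sync"
)

var (
	mu      sync.Mutex
	counter int
	once    sync.Once
)

func setup() { fmt.Println("runs exactly once") }

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			once.Do(setup) // globally-unique operation
			mu.Lock()      // sync lock protecting the counter
			counter++
			mu.Unlock()
		}()
	}
	wg.Wait()
	fmt.Println("counter =", counter)
}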
Typing in the book's code, Go feels more and more like C: pointers everywhere. channels need more study.
Using structs instead of classes is fine, but compared to Java the organization of the code still feels a bit messy; looks like I'll have to get used to it.
Debugging Go needs gdb.exe; the Cygwin and MinGW ones both work, just install one and configure it under DEBUG. Best to keep 32-bit/64-bit consistent, otherwise things may fall apart.
Better not to touch the Build Path in a Go Project; that gets messy even faster.
After a whole day of bitter struggle, lessons learned:
GO语言编程
Day one, personal impressions.
Variable notes:
// int
var a int
a = 1
b := 1

// string
c, str := 1, "string"

// array
var is [2]int
is[0] = 1
is[1] = 2

a := [2][2]int{
	{1, 2},
	{3, 4},
}

var myArray [10]int = [10]int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

// slice
var mySlice []int = myArray[:5]
var slice1 = []int{1, 2, 3, 4, 5}

// struct
type PersonInfo struct {
	ID      string
	Name    string
	Address string
}

// map
var personDB map[string]PersonInfo
personDB = make(map[string]PersonInfo)
personDB["12345"] = PersonInfo{"12345", "Tom", "Room 203"}
personDB["1"] = PersonInfo{"1", "Jack", "Room 103"}

person, ok := personDB["1234"]
if ok {
	fmt.Println("Found person ", person.Name, " with ID 1234.")
} else {
	fmt.Println("Did not find a person with ID 1234.")
}

monthdays := map[string]int{
	"Jan": 31, "Feb": 28,
} // the trailing comma is required
fmt.Println("---- ", monthdays) // ---- map[Jan:31 Feb:28]
A Go function is made up of: the func keyword, the function name, the parameter list, return values, the function body, and the return statement. A small example follows.
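For example (divmod is a toy example of my own, showing all the pieces plus Go's multiple return values):

package main

import "fmt"

// func keyword, name, parameter list, named return values, body, return.
func divmod(a, b int) (q, r int) {
	q, r = a/b, a%b
	return
}

func main() {
	q, r := divmod(7, 2)
	fmt.Println(q, r) // 3 1
}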
Personally I think the package name here is the name you call it by, i.e.:
Go's "generics": interface{}. Let's understand it that way for now.
func MyPrintf(args ...interface{}) {
	for _, arg := range args {
		switch arg.(type) {
		case int:
			fmt.Println("int value")
		case string:
			fmt.Println("string value")
		case int64:
			fmt.Println("int64 value")
		default:
			fmt.Println("unknown value")
		}
	}
}
func example(x int) int {
	if x == 0 {
		return 5
	} else {
		return x
	}
}
The book says this will throw: function ends without a return statement.
But I ran it and nothing went wrong... (Go 1.1 relaxed the old rule: an if/else in which both branches return now counts as a terminating statement.) Maybe this just proves "better no books than blind faith in books", haha.
package main

import "fmt"

func example(x int) int {
	if x == 0 {
		return 5
	} else {
		return x
	}
}

func main() {
	fmt.Println("Hello, 世界", example(4))
	fmt.Println("Hello, 世界", example(5))
}
Hello, 世界 4
Hello, 世界 5
That rule just makes you want to say: shit.
GO语言编程
Go is an open source programming language that makes it easy to build simple, reliable, and efficient software.
Personally: it's made by Google, so it must be awesome, right...
Download Go from the Go homepage; I picked the msi version, double-click to install.
After installation, open cmd and run go to check the result:
go
echo %GOROOT%   # C:\go
echo %PATH%     # %GOROOT%\bin;%PATH%
Eclipse is a good comrade; there's a plugin for everything. Go's Eclipse plugin is goclipse; I won't waste words on how to install an Eclipse plugin.
Configuration as shown in the screenshot:
Create a Go Project in Eclipse.
Create a Go File in the Go Project.
Type in:
package main

import (
	"fmt"
)

func main() {
	fmt.Println("hello world")
}
Right-click and run it with Run Go Application.
Download Gocode into the $GOHOME directory and unpack it, giving the Gocode path C:\Go\gocode\gocode-master. In the Gocode folder, open cmd and type:

set GOPATH="C:\Go\gocode\gocode-master"
go build
This generates gocode-master.exe under the Gocode folder.
Configure Gocode in Eclipse, as shown:
The result: